A/B Tests
Create, deploy, and analyze A/B tests that split inference traffic across model variants. Supports weighted-random, feature-based, multi-armed-bandit, and canary strategies, plus formal experiments with statistical significance testing.
All endpoints require authentication via the X-API-Key header, and the API key must carry the scope listed for the endpoint.
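As a minimal sketch of the authentication requirement, the helper below builds (but does not send) a request carrying the X-API-Key header. The host `https://api.example.com` and the function name `build_request` are assumptions made for illustration; only the header name and the endpoint path come from this reference.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host; replace with your deployment's URL


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated request for the A/B test API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("X-API-Key", api_key)  # header name taken from this reference
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Example: a list request; send it with urllib.request.urlopen(req).
req = build_request("GET", "/api/v1/ab-tests", api_key="YOUR_API_KEY")
```

Building the request separately from sending it keeps the header logic easy to inspect and test without network access.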
A/B Test Object
{
"id": "ab_abc123",
"name": "Llama vs Mistral routing",
"description": "Compare Llama-3 70B against Mistral-Large on customer support traffic",
"strategy": "weighted_random",
"status": "running",
"variants": [
{
"variantId": "var_control",
"modelId": "model_llama3_70b",
"weight": 50,
"enabled": true,
"isControl": true
},
{
"variantId": "var_treatment",
"modelId": "model_mistral_large",
"weight": 50,
"enabled": true,
"isControl": false
}
],
"tags": ["routing", "production"],
"featureRules": [],
"banditConfig": null,
"canaryConfig": null,
"workspaceId": "ws_abc123",
"organizationId": "org_xyz",
"ownerId": "user_456",
"createdAt": "2025-01-15T10:30:00Z",
"updatedAt": "2025-02-01T14:22:00Z"
}
GET /api/v1/ab-tests
List A/B tests accessible to the authenticated user.
Scope: mlops:read
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| status | string | No | Filter by status: draft, active, running, stopped |
| strategy | string | No | Filter by strategy: weighted_random, feature_based, multi_armed_bandit, canary |
| workspaceId | string | No | Filter by workspace ID |
Response 200 OK
{
"count": 2,
"abTests": [
{
"id": "ab_abc123",
"name": "Llama vs Mistral routing",
"strategy": "weighted_random",
"status": "running",
"variants": [],
"createdAt": "2025-01-15T10:30:00Z"
}
]
}
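The optional filters can be combined in a single query string. The sketch below shows one way to assemble the URL; the function name `list_ab_tests_url` and the base URL passed to it are hypothetical, while the parameter names are the ones documented above.

```python
from urllib.parse import urlencode


def list_ab_tests_url(base_url, status=None, strategy=None, workspace_id=None):
    """Return the list URL, including only the filters that were provided."""
    params = {"status": status, "strategy": strategy, "workspaceId": workspace_id}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url}/api/v1/ab-tests"
    return f"{url}?{query}" if query else url


# Example: running canary tests only.
url = list_ab_tests_url("https://api.example.com", status="running", strategy="canary")
```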
POST /api/v1/ab-tests
Create a new A/B test. Requires at least 2 model variants.
Scope: mlops:write
Request Body
{
"name": "Llama vs Mistral routing",
"strategy": "weighted_random",
"variants": [
{ "variantId": "var_control", "modelId": "model_llama3_70b", "weight": 50, "isControl": true },
{ "variantId": "var_treatment", "modelId": "model_mistral_large", "weight": 50 }
],
"description": "Compare Llama-3 70B against Mistral-Large",
"tags": ["routing"],
"featureRules": [],
"banditConfig": null,
"canaryConfig": null
}
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | A/B test name |
| strategy | string | Yes | Routing strategy: weighted_random, feature_based, multi_armed_bandit, canary |
| variants | array | Yes | Array of { variantId, modelId, weight?, isControl? }; at least 2 required |
| description | string | No | A/B test description |
| tags | array | No | Tags for organizing tests |
| featureRules | array | No | Feature-based routing rules (for the feature_based strategy) |
| banditConfig | object | No | Multi-armed bandit config: { explorationRate, rewardMetric, windowSize } |
| canaryConfig | object | No | Canary config: { initialWeight, incrementStep, successThreshold, rollbackThreshold } |
Response 201 Created
{
"abTestId": "ab_abc123"
}
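Before sending a create request, it can help to check the documented constraints on the client side. The following is a sketch only: `validate_create_payload` is a hypothetical helper, and the server may enforce additional rules not listed in this reference.

```python
ALLOWED_STRATEGIES = {"weighted_random", "feature_based", "multi_armed_bandit", "canary"}


def validate_create_payload(payload):
    """Return a list of problems; an empty list means the documented rules are met."""
    problems = []
    if not payload.get("name"):
        problems.append("name is required")
    if payload.get("strategy") not in ALLOWED_STRATEGIES:
        problems.append("strategy must be one of: " + ", ".join(sorted(ALLOWED_STRATEGIES)))
    variants = payload.get("variants") or []
    if len(variants) < 2:
        problems.append("at least 2 variants are required")
    for i, variant in enumerate(variants):
        if not variant.get("variantId") or not variant.get("modelId"):
            problems.append(f"variants[{i}] needs variantId and modelId")
    return problems


payload = {
    "name": "Llama vs Mistral routing",
    "strategy": "weighted_random",
    "variants": [
        {"variantId": "var_control", "modelId": "model_llama3_70b", "weight": 50, "isControl": True},
        {"variantId": "var_treatment", "modelId": "model_mistral_large", "weight": 50},
    ],
}
problems = validate_create_payload(payload)
```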
GET /api/v1/ab-tests/:id
Get full details of an A/B test.
Scope: mlops:read
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
Response 200 OK
Returns the full A/B Test object.
PUT /api/v1/ab-tests/:id
Update an A/B test's name, description, or tags.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
Request Body
{
"name": "Updated routing test",
"description": "Updated description",
"tags": ["routing", "v2"]
}
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | No | New A/B test name |
| description | string | No | New description |
| tags | array | No | Updated tags |
Response 200 OK
{
"updated": true
}
DELETE /api/v1/ab-tests/:id
Soft-delete an A/B test.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
Response 204 No Content
POST /api/v1/ab-tests/:id/deploy
Deploy an A/B test to production via the AI Gateway. Creates a routing endpoint that distributes inference requests across model variants based on the configured strategy.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
Response 200 OK
{
"deployed": true
}
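To illustrate what distributing requests by weight means for the weighted_random strategy, here is a minimal sketch of weighted random selection over enabled variants. This is not the AI Gateway's implementation, which this reference does not describe; `pick_variant` is a hypothetical function that only demonstrates the general technique.

```python
import random


def pick_variant(variants, rng=random):
    """Pick one enabled variant with probability proportional to its weight."""
    enabled = [v for v in variants if v.get("enabled", True) and v.get("weight", 0) > 0]
    if not enabled:
        raise ValueError("no enabled variant with positive weight")
    weights = [v["weight"] for v in enabled]
    # random.choices performs the weighted draw; k=1 returns a one-element list.
    return rng.choices(enabled, weights=weights, k=1)[0]["variantId"]


variants = [
    {"variantId": "var_control", "weight": 70, "enabled": True},
    {"variantId": "var_treatment", "weight": 30, "enabled": True},
]
chosen = pick_variant(variants)
```

Over many requests, roughly 70% of calls in this example would go to var_control, since the draw depends only on the relative weights of the enabled variants.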
POST /api/v1/ab-tests/:id/stop
Stop a running A/B test deployment.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
Response 200 OK
{
"stopped": true
}
POST /api/v1/ab-tests/:id/start
Restart a stopped A/B test by redeploying it.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
Response 200 OK
{
"started": true
}
PUT /api/v1/ab-tests/:id/variants/:sub/weight
Update the traffic weight for a specific variant.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
| sub | string | Yes | Variant ID |
Request Body
{
"weight": 70
}
| Field | Type | Required | Description |
|---|---|---|---|
| weight | number | Yes | New weight (0–100, percentage of traffic) |
Response 200 OK
{
"updated": true
}
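Because each call changes a single variant, shifting traffic in a two-variant test takes two requests. The sketch below prepares both paths and bodies so that the weights add up to 100. Whether the API requires weights to sum to 100 is not stated in this reference, so treat that as an assumption; `split_weight_updates` is a hypothetical helper.

```python
def split_weight_updates(test_id, control_id, treatment_id, treatment_weight):
    """Return (path, body) pairs for a two-variant split that sums to 100."""
    if not 0 <= treatment_weight <= 100:
        raise ValueError("weight must be between 0 and 100")
    base = f"/api/v1/ab-tests/{test_id}/variants"
    return [
        (f"{base}/{control_id}/weight", {"weight": 100 - treatment_weight}),
        (f"{base}/{treatment_id}/weight", {"weight": treatment_weight}),
    ]


# Example: move the treatment variant to 70% of traffic; send each pair as a PUT.
updates = split_weight_updates("ab_abc123", "var_control", "var_treatment", 70)
```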
PUT /api/v1/ab-tests/:id/variants/:sub/toggle
Enable or disable a variant. At least 2 variants must remain enabled.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
| sub | string | Yes | Variant ID |
Request Body
{
"enabled": false
}
| Field | Type | Required | Description |
|---|---|---|---|
| enabled | boolean | Yes | Enable (true) or disable (false) the variant |
Response 200 OK
{
"toggled": true
}
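The rule that at least 2 variants must remain enabled can be checked locally before calling the endpoint. This is a sketch using a hypothetical helper, `can_disable`, applied to the `variants` array of an A/B Test object; the server remains the source of truth.

```python
def can_disable(variants, variant_id):
    """Return True if disabling variant_id still leaves at least 2 enabled variants."""
    remaining = [
        v for v in variants
        if v.get("enabled", True) and v["variantId"] != variant_id
    ]
    return len(remaining) >= 2


variants = [
    {"variantId": "var_control", "enabled": True},
    {"variantId": "var_treatment", "enabled": True},
]
allowed = can_disable(variants, "var_treatment")  # only one variant would remain enabled
```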
GET /api/v1/ab-tests/:id/metrics
Get A/B test metrics: total requests, success rate, average latency, and per-variant breakdowns.
Scope: mlops:read
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| startDate | string | No | Filter start date (ISO 8601) |
| endDate | string | No | Filter end date (ISO 8601) |
Response 200 OK
{
"totalRequests": 12450,
"successRate": 0.987,
"averageLatencyMs": 312,
"variants": [
{
"variantId": "var_control",
"requests": 6230,
"successRate": 0.985,
"averageLatencyMs": 320
},
{
"variantId": "var_treatment",
"requests": 6220,
"successRate": 0.989,
"averageLatencyMs": 304
}
]
}
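A quick way to confirm that traffic is split as intended is to derive each variant's share of requests from this response. The sketch below assumes the response shape shown above; `summarize_metrics` and the `trafficShare` key are hypothetical names introduced for illustration.

```python
def summarize_metrics(metrics):
    """Return one row per variant with its observed share of total requests."""
    total = metrics["totalRequests"]
    rows = []
    for variant in metrics["variants"]:
        share = variant["requests"] / total if total else 0.0
        rows.append({
            "variantId": variant["variantId"],
            "trafficShare": round(share, 4),
            "successRate": variant["successRate"],
            "averageLatencyMs": variant["averageLatencyMs"],
        })
    return rows


metrics = {
    "totalRequests": 12450,
    "variants": [
        {"variantId": "var_control", "requests": 6230, "successRate": 0.985, "averageLatencyMs": 320},
        {"variantId": "var_treatment", "requests": 6220, "successRate": 0.989, "averageLatencyMs": 304},
    ],
}
rows = summarize_metrics(metrics)
```

For the sample response, the two variants each receive close to half of the traffic, which matches a 50/50 weighting.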
GET /api/v1/ab-tests/:id/predictions
Get prediction logs routed through this A/B test.
Scope: mlops:read
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID |
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| limit | integer | No | Maximum number of results to return |
| offset | integer | No | Number of results to skip (pagination offset) |
| variantId | string | No | Filter by specific variant |
Response 200 OK
{
"count": 50,
"predictions": [
{
"predictionId": "pred_abc001",
"variantId": "var_control",
"modelId": "model_llama3_70b",
"latencyMs": 310,
"reward": 0.92,
"createdAt": "2025-02-01T14:22:00Z"
}
]
}
POST /api/v1/ab-tests/predictions/:id/feedback
Record feedback or a reward for a routed prediction. This is essential for the multi_armed_bandit strategy.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Prediction ID |
Request Body
{
"reward": 0.92,
"label": "correct"
}
| Field | Type | Required | Description |
|---|---|---|---|
| reward | number | No | Reward score from 0 to 1, used for bandit optimization |
| label | string | No | Feedback label (e.g. correct or incorrect) |
Response 200 OK
{
"recorded": true
}
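The following sketch prepares the path and body for a feedback call and checks the reward against the documented 0 to 1 range. `build_feedback` is a hypothetical helper. Requiring at least one of `reward` or `label` is a client-side choice made here, since both fields are documented as optional.

```python
def build_feedback(prediction_id, reward=None, label=None):
    """Return (path, body) for a feedback request on a routed prediction."""
    if reward is None and label is None:
        raise ValueError("provide a reward, a label, or both")
    body = {}
    if reward is not None:
        if not 0.0 <= reward <= 1.0:
            raise ValueError("reward must be between 0 and 1")
        body["reward"] = reward
    if label is not None:
        body["label"] = label
    path = f"/api/v1/ab-tests/predictions/{prediction_id}/feedback"
    return path, body


# Example: send the result as a POST with the JSON body.
path, body = build_feedback("pred_abc001", reward=0.92, label="correct")
```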
POST /api/v1/ab-tests/:id/experiments
Create a formal A/B testing experiment. Defines a control variant (baseline) and treatment variants to compare with statistical significance testing.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | A/B test ID (used as routerId) |
Request Body
{
"name": "Latency reduction experiment",
"controlVariantId": "var_control",
"treatmentVariantIds": ["var_treatment"],
"description": "Test if Mistral reduces latency vs Llama-3",
"hypothesis": "Mistral-Large will reduce p95 latency by 15%",
"primaryMetric": "latency",
"confidenceLevel": 0.95,
"minimumDetectableEffect": 0.05,
"minSamplePerVariant": 1000
}
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Experiment name |
| controlVariantId | string | Yes | Control (baseline) variant ID |
| treatmentVariantIds | array | Yes | Array of treatment variant IDs to compare against control |
| description | string | No | Experiment description |
| hypothesis | string | No | What you expect to happen |
| primaryMetric | string | No | Primary metric to measure (e.g. latency, accuracy, reward) |
| confidenceLevel | number | No | Statistical confidence level (default 0.95) |
| minimumDetectableEffect | number | No | Minimum effect size to detect |
| minSamplePerVariant | number | No | Minimum samples per variant before analysis |
Response 201 Created
{
"experimentId": "exp_abc001"
}
POST /api/v1/ab-tests/experiments/:id/start
Start collecting data for an A/B testing experiment.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Experiment ID |
Response 200 OK
{
"experimentId": "exp_abc001",
"status": "running",
"startedAt": "2025-02-01T14:22:00Z"
}
POST /api/v1/ab-tests/experiments/:id/analyze
Run statistical analysis on an experiment. Returns p-values, confidence intervals, and whether the treatment outperforms the control.
Scope: mlops:read
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Experiment ID |
Response 200 OK
{
"experimentId": "exp_abc001",
"controlMean": 320.1,
"treatmentMean": 304.5,
"pValue": 0.013,
"confidenceInterval": [-22.4, -8.7],
"significant": true,
"winner": "var_treatment"
}
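The fields in this response can be cross-checked against each other. The sketch below assumes that `confidenceInterval` describes the difference between treatment and control means (the sample values are consistent with that reading, but the reference does not state it) and that a result is significant when `pValue` falls below 1 minus the confidence level. `interpret_analysis` is a hypothetical helper.

```python
def interpret_analysis(result, confidence_level=0.95, lower_is_better=True):
    """Derive a few readable checks from an analysis response."""
    alpha = 1 - confidence_level
    low, high = result["confidenceInterval"]
    diff = result["treatmentMean"] - result["controlMean"]
    return {
        "difference": round(diff, 3),
        "relativeChange": round(diff / result["controlMean"], 4),
        "pBelowAlpha": result["pValue"] < alpha,
        "ciExcludesZero": low > 0 or high < 0,
        # For latency, a negative difference means the treatment is faster.
        "treatmentBetter": diff < 0 if lower_is_better else diff > 0,
    }


result = {
    "controlMean": 320.1,
    "treatmentMean": 304.5,
    "pValue": 0.013,
    "confidenceInterval": [-22.4, -8.7],
}
summary = interpret_analysis(result)
```

For the sample response, the treatment mean is about 4.9% lower than the control mean, the p-value is below 0.05, and the interval excludes zero, which agrees with `significant: true` and the reported winner.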
POST /api/v1/ab-tests/experiments/:id/stop
Stop a running experiment. Freezes data collection.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Experiment ID |
Response 200 OK
{
"experimentId": "exp_abc001",
"status": "stopped",
"stoppedAt": "2025-02-05T10:00:00Z"
}
DELETE /api/v1/ab-tests/experiments/:id
Delete an A/B testing experiment and its data.
Scope: mlops:write
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Experiment ID |
Response 204 No Content