A/B Tests

Create, deploy, and analyze A/B tests that split inference traffic across model variants. The API supports weighted-random, feature-based, multi-armed-bandit, and canary routing strategies, plus formal experiments with statistical significance testing.

All endpoints require authentication via the `X-API-Key` header, and the API key must carry the scope listed for each endpoint (`mlops:read` or `mlops:write`).
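
As a minimal sketch of the authentication requirement, the helper below builds (but does not send) a request that carries the `X-API-Key` header. The base URL `https://api.example.com` and the function name `build_request` are placeholders for illustration, not part of the documented API.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the host of your own deployment.
BASE_URL = "https://api.example.com"


def build_request(method, path, api_key, body=None):
    """Return an unsent urllib Request that carries the X-API-Key header."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    # urllib normalizes header capitalization; HTTP header names are
    # case-insensitive, so the server still sees X-API-Key.
    req.add_header("X-API-Key", api_key)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


req = build_request("GET", "/api/v1/ab-tests", api_key="YOUR_API_KEY")
# Sending the request would be: urllib.request.urlopen(req)
```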

A/B Test Object

{
  "id": "ab_abc123",
  "name": "Llama vs Mistral routing",
  "description": "Compare Llama-3 70B against Mistral-Large on customer support traffic",
  "strategy": "weighted_random",
  "status": "running",
  "variants": [
    {
      "variantId": "var_control",
      "modelId": "model_llama3_70b",
      "weight": 0.5,
      "enabled": true,
      "isControl": true
    },
    {
      "variantId": "var_treatment",
      "modelId": "model_mistral_large",
      "weight": 0.5,
      "enabled": true,
      "isControl": false
    }
  ],
  "tags": ["routing", "production"],
  "featureRules": [],
  "banditConfig": null,
  "canaryConfig": null,
  "workspaceId": "ws_abc123",
  "organizationId": "org_xyz",
  "ownerId": "user_456",
  "createdAt": "2025-01-15T10:30:00Z",
  "updatedAt": "2025-02-01T14:22:00Z"
}

`GET` /api/v1/ab-tests

List A/B tests accessible to the authenticated user.

Scope: mlops:read

Query Parameters

Parameter	Type	Required	Description
`status`	string	No	Filter by status: `draft`, `active`, `running`, `stopped`
`strategy`	string	No	Filter by strategy: `weighted_random`, `feature_based`, `multi_armed_bandit`, `canary`
`workspaceId`	string	No	Filter by workspace ID

Response 200 OK

{
  "count": 2,
  "abTests": [
    {
      "id": "ab_abc123",
      "name": "Llama vs Mistral routing",
      "strategy": "weighted_random",
      "status": "running",
      "variants": [],
      "createdAt": "2025-01-15T10:30:00Z"
    }
  ]
}
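
The query parameters above can be assembled into a request path as in the following sketch. The helper name `list_ab_tests_path` is hypothetical, and the client-side validation simply mirrors the allowed values listed in the table; the server may accept or reject values differently.

```python
from urllib.parse import urlencode

# Allowed values as listed in the Query Parameters table.
VALID_STATUSES = {"draft", "active", "running", "stopped"}
VALID_STRATEGIES = {"weighted_random", "feature_based", "multi_armed_bandit", "canary"}


def list_ab_tests_path(status=None, strategy=None, workspace_id=None):
    """Build the path and query string for GET /api/v1/ab-tests."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    if strategy is not None and strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    candidates = [
        ("status", status),
        ("strategy", strategy),
        ("workspaceId", workspace_id),
    ]
    # Only include filters the caller actually supplied.
    query = urlencode([(k, v) for k, v in candidates if v is not None])
    return "/api/v1/ab-tests" + ("?" + query if query else "")


path = list_ab_tests_path(status="running", strategy="canary")
```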

`POST` /api/v1/ab-tests

Create a new A/B test. Requires at least 2 model variants.

Scope: mlops:write

Request Body

{
  "name": "Llama vs Mistral routing",
  "strategy": "weighted_random",
  "variants": [
    { "variantId": "var_control", "modelId": "model_llama3_70b", "weight": 0.5, "isControl": true },
    { "variantId": "var_treatment", "modelId": "model_mistral_large", "weight": 0.5 }
  ],
  "description": "Compare Llama-3 70B against Mistral-Large",
  "tags": ["routing"],
  "featureRules": [],
  "banditConfig": null,
  "canaryConfig": null
}

Field	Type	Required	Description
`name`	string	Yes	A/B test name
`strategy`	string	Yes	Routing strategy: `weighted_random`, `feature_based`, `multi_armed_bandit`, `canary`
`variants`	array	Yes	Array of `{ variantId, modelId, weight?, isControl? }` objects; at least 2 required
`description`	string	No	A/B test description
`tags`	array	No	Tags for organization
`featureRules`	array	No	Feature-based routing rules (for `feature_based` strategy)
`banditConfig`	object	No	Multi-armed bandit config: `{ explorationRate, rewardMetric, windowSize }`
`canaryConfig`	object	No	Canary config: `{ initialWeight, incrementStep, successThreshold, rollbackThreshold }`

Response 201 Created

{
  "abTestId": "ab_abc123"
}
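
Before sending a create request, it can help to check the documented constraints on the client side. The sketch below checks only what the field table states (required fields, known strategies, at least 2 variants); `validate_create_payload` is a hypothetical helper, and the server may enforce further rules that are not documented here.

```python
STRATEGIES = {"weighted_random", "feature_based", "multi_armed_bandit", "canary"}


def validate_create_payload(payload):
    """Return a list of problems found in a POST /api/v1/ab-tests body."""
    errors = []
    for field in ("name", "strategy", "variants"):
        if field not in payload:
            errors.append(f"missing required field: {field}")
    if "strategy" in payload and payload["strategy"] not in STRATEGIES:
        errors.append(f"unknown strategy: {payload['strategy']}")
    variants = payload.get("variants", [])
    if "variants" in payload and len(variants) < 2:
        errors.append("at least 2 variants are required")
    for i, variant in enumerate(variants):
        # variantId and modelId appear without a '?' in the field table.
        for key in ("variantId", "modelId"):
            if key not in variant:
                errors.append(f"variants[{i}] is missing {key}")
    return errors


payload = {
    "name": "Llama vs Mistral routing",
    "strategy": "weighted_random",
    "variants": [
        {"variantId": "var_control", "modelId": "model_llama3_70b", "weight": 0.5, "isControl": True},
        {"variantId": "var_treatment", "modelId": "model_mistral_large", "weight": 0.5},
    ],
}
problems = validate_create_payload(payload)
```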

`GET` /api/v1/ab-tests/:id

Get full details of an A/B test.

Scope: mlops:read

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID

Response 200 OK

Returns the full A/B Test object.

`PUT` /api/v1/ab-tests/:id

Update an A/B test's name, description, or tags.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID

Request Body

{
  "name": "Updated routing test",
  "description": "Updated description",
  "tags": ["routing", "v2"]
}

Field	Type	Required	Description
`name`	string	No	New A/B test name
`description`	string	No	New description
`tags`	array	No	Updated tags

Response 200 OK

{
  "updated": true
}

`DELETE` /api/v1/ab-tests/:id

Soft-delete an A/B test.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID

Response 204 No Content

`POST` /api/v1/ab-tests/:id/deploy

Deploy an A/B test to production via the AI Gateway. Creates a routing endpoint that distributes inference requests across model variants based on the configured strategy.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID

Response 200 OK

{
  "deployed": true
}

`POST` /api/v1/ab-tests/:id/stop

Stop a running A/B test deployment.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID

Response 200 OK

{
  "stopped": true
}

`POST` /api/v1/ab-tests/:id/start

Start a stopped A/B test (re-deploys it).

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID

Response 200 OK

{
  "started": true
}

`PUT` /api/v1/ab-tests/:id/variants/:sub/weight

Update the traffic weight for a specific variant.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID
`sub`	string	Yes	Variant ID

Request Body

{
  "weight": 70
}

Field	Type	Required	Description
`weight`	number	Yes	New weight (0–100, percentage of traffic)

Response 200 OK

{
  "updated": true
}
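
A small sketch of preparing a weight update follows. Note that the A/B Test object example shows fractional weights (`0.5`), while this endpoint documents a 0–100 range. The `fraction_to_percent` helper assumes the two are related by a factor of 100, which the reference does not confirm, so verify this against the actual API. Both function names are hypothetical.

```python
def fraction_to_percent(fraction):
    """Convert a fractional weight (0-1) to a percentage (0-100).

    Assumption: the fractional weights in the A/B Test object map to this
    endpoint's percentages by a factor of 100.
    """
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    return round(fraction * 100, 6)


def weight_update(ab_test_id, variant_id, weight):
    """Return (path, body) for PUT .../variants/:sub/weight."""
    if not 0 <= weight <= 100:
        raise ValueError("weight must be between 0 and 100")
    path = f"/api/v1/ab-tests/{ab_test_id}/variants/{variant_id}/weight"
    return path, {"weight": weight}


path, body = weight_update("ab_abc123", "var_treatment", 70)
```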

`PUT` /api/v1/ab-tests/:id/variants/:sub/toggle

Enable or disable a variant. At least 2 variants must remain enabled.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID
`sub`	string	Yes	Variant ID

Request Body

{
  "enabled": false
}

Field	Type	Required	Description
`enabled`	boolean	Yes	Enable (`true`) or disable (`false`) the variant

Response 200 OK

{
  "toggled": true
}
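
The rule that at least 2 variants must remain enabled can be checked locally before calling the endpoint. This is a sketch only: `can_toggle` is a hypothetical helper, and treating a variant without an `enabled` field as enabled is an assumption.

```python
def can_toggle(variants, variant_id, enabled):
    """Return True if applying the toggle leaves at least 2 variants enabled."""
    if not any(v["variantId"] == variant_id for v in variants):
        raise KeyError(f"unknown variant: {variant_id}")
    enabled_after = sum(
        # Assumption: a variant with no 'enabled' field counts as enabled.
        bool(enabled if v["variantId"] == variant_id else v.get("enabled", True))
        for v in variants
    )
    return enabled_after >= 2


variants = [
    {"variantId": "var_control", "enabled": True},
    {"variantId": "var_treatment", "enabled": True},
]
# Disabling one of only two enabled variants would violate the rule.
allowed = can_toggle(variants, "var_treatment", False)
```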

`GET` /api/v1/ab-tests/:id/metrics

Get A/B test metrics: total requests, success rate, average latency, and per-variant breakdowns.

Scope: mlops:read

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID

Query Parameters

Parameter	Type	Required	Description
`startDate`	string	No	Filter start date (ISO 8601)
`endDate`	string	No	Filter end date (ISO 8601)

Response 200 OK

{
  "totalRequests": 12450,
  "successRate": 0.987,
  "averageLatencyMs": 312,
  "variants": [
    {
      "variantId": "var_control",
      "requests": 6230,
      "successRate": 0.985,
      "averageLatencyMs": 320
    },
    {
      "variantId": "var_treatment",
      "requests": 6220,
      "successRate": 0.989,
      "averageLatencyMs": 304
    }
  ]
}
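
The reference does not state how the top-level figures relate to the per-variant breakdown. Assuming they are request-weighted averages, the sketch below recomputes them from the `variants` array; with the sample response above it reproduces the top-level values. The function name `aggregate` is hypothetical.

```python
def aggregate(variants):
    """Recompute overall metrics as request-weighted averages (assumption)."""
    total = sum(v["requests"] for v in variants)
    if total == 0:
        return {"totalRequests": 0, "successRate": None, "averageLatencyMs": None}
    success = sum(v["requests"] * v["successRate"] for v in variants) / total
    latency = sum(v["requests"] * v["averageLatencyMs"] for v in variants) / total
    return {
        "totalRequests": total,
        "successRate": round(success, 3),
        "averageLatencyMs": round(latency),
    }


sample_variants = [
    {"variantId": "var_control", "requests": 6230, "successRate": 0.985, "averageLatencyMs": 320},
    {"variantId": "var_treatment", "requests": 6220, "successRate": 0.989, "averageLatencyMs": 304},
]
overall = aggregate(sample_variants)
```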

`GET` /api/v1/ab-tests/:id/predictions

Get prediction logs routed through this A/B test.

Scope: mlops:read

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID

Query Parameters

Parameter	Type	Required	Description
`limit`	integer	No	Max results
`offset`	integer	No	Pagination offset
`variantId`	string	No	Filter by specific variant

Response 200 OK

{
  "count": 50,
  "predictions": [
    {
      "predictionId": "pred_abc001",
      "variantId": "var_control",
      "modelId": "model_llama3_70b",
      "latencyMs": 310,
      "reward": 0.92,
      "createdAt": "2025-02-01T14:22:00Z"
    }
  ]
}
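
Paging through prediction logs with `limit` and `offset` might look like the sketch below. Because the reference does not say whether `count` is the page size or the overall total, the loop stops when a page returns fewer items than `limit`. The `fetch_page` callable is a stand-in for whatever HTTP client you use, and `iter_predictions` is a hypothetical name.

```python
def iter_predictions(fetch_page, limit=50, variant_id=None):
    """Yield predictions page by page using limit/offset pagination.

    fetch_page(params) must return the parsed JSON response for
    GET /api/v1/ab-tests/:id/predictions with the given query parameters.
    """
    offset = 0
    while True:
        params = {"limit": limit, "offset": offset}
        if variant_id is not None:
            params["variantId"] = variant_id
        page = fetch_page(params)
        items = page.get("predictions", [])
        yield from items
        if len(items) < limit:
            break  # a short page is taken to mean there are no more results
        offset += limit


# In-memory stand-in for the HTTP call, for demonstration only.
_LOGS = [{"predictionId": f"pred_{i:03d}", "variantId": "var_control"} for i in range(5)]


def fake_fetch(params):
    chunk = _LOGS[params["offset"]: params["offset"] + params["limit"]]
    return {"count": len(chunk), "predictions": chunk}


all_predictions = list(iter_predictions(fake_fetch, limit=2))
```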

`POST` /api/v1/ab-tests/predictions/:id/feedback

Record feedback or a reward for a routed prediction. Rewards are essential for the multi-armed-bandit strategy.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	Prediction ID

Request Body

{
  "reward": 0.92,
  "label": "correct"
}

Field	Type	Required	Description
`reward`	number	No	Reward score (0–1 for bandit optimization)
`label`	string	No	Feedback label (e.g. `correct`/`incorrect`)

Response 200 OK

{
  "recorded": true
}
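
A sketch of building the feedback request follows. Both fields are documented as optional; rejecting a call that supplies neither is a client-side choice made here, not documented server behavior. The helper name `feedback_request` is hypothetical.

```python
def feedback_request(prediction_id, reward=None, label=None):
    """Return (path, body) for POST /api/v1/ab-tests/predictions/:id/feedback."""
    if reward is None and label is None:
        # Client-side choice: an empty feedback body carries no information.
        raise ValueError("provide reward, label, or both")
    if reward is not None and not 0 <= reward <= 1:
        raise ValueError("reward must be between 0 and 1")
    body = {}
    if reward is not None:
        body["reward"] = reward
    if label is not None:
        body["label"] = label
    return f"/api/v1/ab-tests/predictions/{prediction_id}/feedback", body


path, body = feedback_request("pred_abc001", reward=0.92, label="correct")
```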

`POST` /api/v1/ab-tests/:id/experiments

Create a formal A/B testing experiment. The experiment defines a control variant (the baseline) and one or more treatment variants, which are compared against the control using statistical significance testing.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	A/B test ID (used as `routerId`)

Request Body

{
  "name": "Latency reduction experiment",
  "controlVariantId": "var_control",
  "treatmentVariantIds": ["var_treatment"],
  "description": "Test if Mistral reduces latency vs Llama-3",
  "hypothesis": "Mistral-Large will reduce p95 latency by 15%",
  "primaryMetric": "latency",
  "confidenceLevel": 0.95,
  "minimumDetectableEffect": 0.05,
  "minSamplePerVariant": 1000
}

Field	Type	Required	Description
`name`	string	Yes	Experiment name
`controlVariantId`	string	Yes	Control (baseline) variant ID
`treatmentVariantIds`	array	Yes	Array of treatment variant IDs to compare against control
`description`	string	No	Experiment description
`hypothesis`	string	No	What you expect to happen
`primaryMetric`	string	No	Primary metric to measure (e.g. `latency`, `accuracy`, `reward`)
`confidenceLevel`	number	No	Statistical confidence level (default `0.95`)
`minimumDetectableEffect`	number	No	Minimum effect size to detect
`minSamplePerVariant`	number	No	Minimum samples per variant before analysis

Response 201 Created

{
  "experimentId": "exp_abc001"
}
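
The sketch below assembles an experiment body and applies a few sanity checks. The checks (a non-empty treatment list, the control not also listed as a treatment, a confidence level strictly between 0 and 1) are reasonable assumptions rather than documented validation rules, and `build_experiment` is a hypothetical helper.

```python
def build_experiment(name, control_variant_id, treatment_variant_ids, **optional):
    """Build a request body for POST /api/v1/ab-tests/:id/experiments."""
    treatments = list(treatment_variant_ids)
    if not treatments:
        raise ValueError("at least one treatment variant is needed")
    if control_variant_id in treatments:
        raise ValueError("the control variant cannot also be a treatment")
    # 0.95 is the documented default for confidenceLevel.
    level = optional.get("confidenceLevel", 0.95)
    if not 0 < level < 1:
        raise ValueError("confidenceLevel must be between 0 and 1")
    body = {
        "name": name,
        "controlVariantId": control_variant_id,
        "treatmentVariantIds": treatments,
    }
    body.update(optional)  # description, hypothesis, primaryMetric, ...
    return body


experiment = build_experiment(
    "Latency reduction experiment",
    "var_control",
    ["var_treatment"],
    primaryMetric="latency",
    minSamplePerVariant=1000,
)
```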

`POST` /api/v1/ab-tests/experiments/:id/start

Start collecting data for an A/B testing experiment.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	Experiment ID

Response 200 OK

{
  "experimentId": "exp_abc001",
  "status": "running",
  "startedAt": "2025-02-01T14:22:00Z"
}

`POST` /api/v1/ab-tests/experiments/:id/analyze

Run statistical analysis on an experiment. Returns p-values, confidence intervals, and whether the treatment outperforms the control.

Scope: mlops:read

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	Experiment ID

Response 200 OK

{
  "experimentId": "exp_abc001",
  "controlMean": 320.1,
  "treatmentMean": 304.5,
  "pValue": 0.013,
  "confidenceInterval": [-22.4, -8.7],
  "significant": true,
  "winner": "var_treatment"
}
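
To read the analysis response, the following sketch derives a few checks from its fields. It assumes that `confidenceInterval` describes the treatment-minus-control difference and that significance corresponds to `pValue` falling below `1 - confidenceLevel`; neither is stated in the reference. The function name `summarize` is hypothetical.

```python
def summarize(analysis, confidence_level=0.95):
    """Derive simple checks from an experiment analysis response."""
    alpha = 1 - confidence_level
    low, high = analysis["confidenceInterval"]
    return {
        # Negative means the treatment mean is lower than the control mean.
        "difference": round(analysis["treatmentMean"] - analysis["controlMean"], 6),
        "pBelowAlpha": analysis["pValue"] < alpha,
        # An interval that excludes zero agrees with a significant result.
        "ciExcludesZero": low > 0 or high < 0,
    }


analysis = {
    "experimentId": "exp_abc001",
    "controlMean": 320.1,
    "treatmentMean": 304.5,
    "pValue": 0.013,
    "confidenceInterval": [-22.4, -8.7],
    "significant": True,
    "winner": "var_treatment",
}
summary = summarize(analysis)
```

For a latency metric, a negative difference with an interval entirely below zero is consistent with the sample response naming `var_treatment` as the winner.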

`POST` /api/v1/ab-tests/experiments/:id/stop

Stop a running experiment and freeze its data collection.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	Experiment ID

Response 200 OK

{
  "experimentId": "exp_abc001",
  "status": "stopped",
  "stoppedAt": "2025-02-05T10:00:00Z"
}

`DELETE` /api/v1/ab-tests/experiments/:id

Delete an A/B testing experiment and its data.

Scope: mlops:write

Path Parameters

Parameter	Type	Required	Description
`id`	string	Yes	Experiment ID

Response 204 No Content
