A/B Testing Routers

A/B testing routers enable you to split traffic between multiple model variants, compare their performance, and make data-driven decisions about which model to deploy to production.

Why Use A/B Testing?

Machine learning models need continuous validation in production. A/B testing helps you:

  • Validate Model Changes: Before fully deploying a new model, test it against your current production model with real traffic
  • Measure Impact: Quantify the difference in accuracy, latency, and business metrics between model versions
  • Reduce Risk: Gradually roll out new models rather than replacing everything at once
  • Optimize Continuously: Use multi-armed bandit algorithms to automatically route more traffic to better-performing models

Routing Strategies

Weighted Random

Split traffic randomly based on configured percentages. For example, route 80% of requests to the control model and 20% to a new variant.

Best for: Standard A/B tests where you want precise traffic allocation
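As a rough illustration, weighted random selection can be sketched in a few lines. This is a minimal sketch, not the platform's implementation; the variant IDs and weights mirror the 80/20 example above.

```python
import random

def choose_variant(variants, rng=random):
    """Pick a variant ID with probability proportional to its weight.

    `variants` maps variant_id -> weight; weights need not sum to 100.
    """
    ids = list(variants)
    weights = [variants[v] for v in ids]
    return rng.choices(ids, weights=weights, k=1)[0]

# Example: an 80/20 split between the control and a new variant.
random.seed(0)
counts = {"control": 0, "treatment_a": 0}
for _ in range(10_000):
    counts[choose_variant({"control": 80, "treatment_a": 20})] += 1
# counts["control"] lands near 8,000
```

Because each request is routed independently, the realized split fluctuates around the configured percentages and converges as volume grows.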

Feature-Based

Route requests based on input feature values. For example, route premium customers to a specialized model while standard customers use the default model.

Best for: Segment-specific testing, personalized model selection
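A feature-based rule is essentially a deterministic function of the request's input features. The sketch below uses hypothetical field and model names (`customer_tier`, `premium_model`) to mirror the premium-customer example above; your actual routing rules are configured in the router, not in client code.

```python
def route_by_feature(features: dict) -> str:
    """Illustrative rule: premium customers get a specialized model,
    everyone else gets the default. Field names are hypothetical."""
    if features.get("customer_tier") == "premium":
        return "premium_model"
    return "default_model"

variant = route_by_feature({"customer_tier": "premium", "age": 35})
```

Unlike weighted random routing, the same input always maps to the same variant, which makes per-segment comparisons straightforward.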

Multi-Armed Bandit

Automatically balance exploration (testing new variants) and exploitation (routing to the best performer). The algorithm learns which variants perform best and adjusts traffic allocation over time.

Algorithms available:

  • Epsilon-Greedy: Routes most traffic to the best variant, with a small percentage exploring others
  • Thompson Sampling: Uses Bayesian probability to balance exploration and exploitation
  • UCB1: Upper Confidence Bound algorithm that adds an uncertainty bonus to each variant's observed mean reward, favoring variants that have been tried less often

Best for: Continuous optimization, when you want to minimize exposure to poor-performing variants
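To make the exploration/exploitation trade-off concrete, here is a minimal epsilon-greedy sketch with a short simulation. It is illustrative only, not the platform's implementation; the variant IDs and reward rates are invented for the demo.

```python
import random

class EpsilonGreedyRouter:
    """Minimal epsilon-greedy bandit over variant IDs.

    With probability epsilon, explore a random variant; otherwise
    exploit the variant with the highest observed mean reward.
    """
    def __init__(self, variant_ids, epsilon=0.1, rng=random):
        self.epsilon = epsilon
        self.rng = rng
        self.counts = {v: 0 for v in variant_ids}
        self.totals = {v: 0.0 for v in variant_ids}

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.counts))
        # Untried variants get an infinite mean, so each is tried at least once.
        return max(self.counts, key=lambda v: self.totals[v] / self.counts[v]
                   if self.counts[v] else float("inf"))

    def record(self, variant_id, reward):
        self.counts[variant_id] += 1
        self.totals[variant_id] += reward

# Simulate: treatment_a truly succeeds 70% of the time vs. 50% for control.
random.seed(1)
router = EpsilonGreedyRouter(["control", "treatment_a"], epsilon=0.1)
true_rate = {"control": 0.50, "treatment_a": 0.70}
for _ in range(5000):
    v = router.choose()
    router.record(v, 1.0 if random.random() < true_rate[v] else 0.0)
# Traffic shifts heavily toward treatment_a as its higher reward emerges.
```

Thompson Sampling and UCB1 differ only in the `choose` step: Thompson Sampling samples from a posterior over each variant's reward, and UCB1 adds a confidence bonus to each variant's mean.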

Creating a Router

  1. Navigate to MLOps → A/B Testing Routers
  2. Click Create Router
  3. Enter a name and description
  4. Select a routing strategy
  5. Add model variants:
    • Choose at least 2 models from your Model Registry
    • Assign variant IDs (e.g., "control", "treatment_a")
    • Set traffic weights (for weighted random)
    • Mark one variant as the control
  6. Configure strategy-specific settings if needed
  7. Click Create Router
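The steps above amount to a configuration like the following sketch. The field names here are hypothetical (this is not the platform's actual schema); the variant IDs and weights follow the earlier 80/20 example.

```json
{
  "name": "loan-approval-ab-test",
  "description": "Compare retrained model against current production model",
  "strategy": "weighted_random",
  "variants": [
    {"variant_id": "control", "model_id": "model-123", "weight": 80, "is_control": true},
    {"variant_id": "treatment_a", "model_id": "model-456", "weight": 20, "is_control": false}
  ]
}
```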

Managing Variants

Adjusting Traffic Weights

For weighted random routers, you can adjust traffic allocation in real-time:

  1. Open the router details page
  2. Go to the Variants tab
  3. Adjust the weight slider or enter a specific percentage
  4. Changes take effect immediately

Enabling/Disabling Variants

You can temporarily disable a variant without deleting it:

  1. Toggle the variant's status in the Variants tab
  2. Traffic will be redistributed among remaining enabled variants
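One plausible redistribution scheme is proportional renormalization: the disabled variant's share is split among the remaining enabled variants pro rata to their existing weights. This sketch assumes that behavior; the platform's exact redistribution rule may differ.

```python
def effective_weights(weights: dict, disabled: set) -> dict:
    """Renormalize weights over enabled variants so they sum to 100."""
    enabled = {v: w for v, w in weights.items() if v not in disabled}
    total = sum(enabled.values())
    return {v: 100 * w / total for v, w in enabled.items()}

# Disabling treatment_b hands its 10% share to the others, pro rata:
w = effective_weights({"control": 70, "treatment_a": 20, "treatment_b": 10},
                      {"treatment_b"})
# w ≈ {"control": 77.78, "treatment_a": 22.22}
```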

Viewing Results

Traffic Distribution

The Traffic tab shows real-time visualization of:

  • Request distribution across variants
  • Traffic over time
  • Success/error rates per variant

Prediction Logs

Every prediction made through the router is logged with:

  • Which variant handled the request
  • Routing decision reason
  • Input features (summary)
  • Prediction result
  • Response latency
  • Success/failure status
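A log record covering those fields might look like the sketch below. The exact field names are hypothetical; the values echo the request and response examples later in this page.

```json
{
  "variant_id": "treatment_a",
  "model_id": "model-123",
  "routing_reason": "weighted_random",
  "feature_summary": {"age": 35, "income": 75000},
  "prediction": "approved",
  "latency_ms": 45,
  "success": true
}
```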

Comparing Variants

Use the metrics view to compare variants side-by-side:

  • Success Rate: Percentage of successful predictions
  • Average Latency: Response time in milliseconds
  • Request Volume: Total requests processed
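These three metrics are simple aggregations over the prediction logs. The sketch below shows one way to compute them from records shaped like the log entries described above (field names assumed, not the platform's export format).

```python
def summarize(logs):
    """Aggregate per-variant success rate, mean latency, and volume."""
    out = {}
    for rec in logs:
        s = out.setdefault(rec["variant_id"],
                           {"ok": 0, "n": 0, "latency_total": 0.0})
        s["n"] += 1
        s["ok"] += rec["success"]          # True counts as 1
        s["latency_total"] += rec["latency_ms"]
    return {
        v: {"success_rate": s["ok"] / s["n"],
            "avg_latency_ms": s["latency_total"] / s["n"],
            "requests": s["n"]}
        for v, s in out.items()
    }

logs = [
    {"variant_id": "control", "success": True, "latency_ms": 40},
    {"variant_id": "control", "success": False, "latency_ms": 60},
    {"variant_id": "treatment_a", "success": True, "latency_ms": 45},
]
metrics = summarize(logs)
```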

Deploying and Managing

Starting a Router

  1. Click Deploy to launch the router container on Kubernetes
  2. The router creates an API endpoint for inference
  3. Route your application's prediction requests through this endpoint

Stopping a Router

Click Stop to pause the router while preserving all configuration and historical data.

Using the Router Endpoint

Once deployed, your router exposes a prediction endpoint:

curl -X POST https://your-router-endpoint/predict \
  -H "Content-Type: application/json" \
  -d '{"features": {"age": 35, "income": 75000}}'

The response includes the prediction result plus routing metadata:

{
  "prediction": "approved",
  "variant_id": "treatment_a",
  "model_id": "model-123",
  "latency_ms": 45
}
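The same call can be made from application code. This is a minimal stdlib sketch (no client library assumed); the endpoint URL is the placeholder from the curl example, and the routing metadata fields match the response shown above.

```python
import json
import urllib.request

ROUTER_URL = "https://your-router-endpoint/predict"  # placeholder from the docs

def predict(features: dict) -> dict:
    """POST features to the router endpoint and decode the JSON response."""
    req = urllib.request.Request(
        ROUTER_URL,
        data=json.dumps({"features": features}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# The routing metadata tells you which variant served each request,
# e.g. for the sample response shown above:
sample = json.loads('{"prediction": "approved", "variant_id": "treatment_a", '
                    '"model_id": "model-123", "latency_ms": 45}')
```

Logging `variant_id` alongside your own request IDs makes it easy to join business outcomes back to variants later.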

Providing Feedback (for Bandits)

For multi-armed bandit routing to optimize traffic allocation, you need to provide ground truth data. This uses the same mechanism as drift detection: upload actual outcomes, which are matched to predictions via entityId.

When ground truth is uploaded, the system automatically calculates rewards based on prediction accuracy and updates the bandit algorithm's understanding of which variant performs best.

See Drift Detection → Ground Truth Integration for details on uploading ground truth data.
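Conceptually, the reward computation joins outcomes to predictions on entityId and scores each match. The sketch below assumes a simple 1.0/0.0 accuracy reward; the platform's actual reward function and record shapes may differ.

```python
def rewards_from_ground_truth(prediction_log, ground_truth):
    """Match outcomes to predictions by entityId; emit (variant_id, reward).

    Reward here is 1.0 for a correct prediction, 0.0 otherwise --
    an assumed, purely accuracy-based scheme.
    """
    truth = {g["entityId"]: g["actual"] for g in ground_truth}
    rewards = []
    for rec in prediction_log:
        if rec["entityId"] in truth:
            correct = rec["prediction"] == truth[rec["entityId"]]
            rewards.append((rec["variant_id"], 1.0 if correct else 0.0))
    return rewards

predictions = [
    {"entityId": "e1", "variant_id": "control", "prediction": "approved"},
    {"entityId": "e2", "variant_id": "treatment_a", "prediction": "denied"},
]
outcomes = [
    {"entityId": "e1", "actual": "approved"},
    {"entityId": "e2", "actual": "approved"},
]
rewards = rewards_from_ground_truth(predictions, outcomes)
```

Each (variant, reward) pair then feeds the bandit's update step, shifting future traffic toward the variants that score best.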

Best Practices

  1. Run tests long enough: Ensure you have statistically significant sample sizes before drawing conclusions
  2. Define success metrics upfront: Know what you're measuring before starting the test
  3. Start with small traffic percentages: Begin with 5-10% traffic to the new variant before increasing
  4. Monitor continuously: Watch for unexpected behavior like increased error rates or latency spikes
  5. Document experiments: Keep notes on why you're running each test and what you learned
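For point 1, a rough per-variant sample size for detecting a difference in success rates can be estimated with the standard two-proportion normal approximation. This is a back-of-the-envelope sketch, not a substitute for a proper experiment-design tool.

```python
from math import ceil, sqrt

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-variant n for a two-sided two-proportion z-test."""
    # z-values for common alpha/power levels, hardcoded to avoid a
    # stats dependency; other levels would need their own entries.
    z_alpha = {0.05: 1.96, 0.01: 2.576}[alpha]
    z_beta = {0.8: 0.842, 0.9: 1.282}[power]
    p_bar = (p1 + p2) / 2
    numer = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
             + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numer / (p1 - p2) ** 2)

# Detecting a lift from 90% to 92% success takes thousands of requests
# per variant, while a 90% -> 95% lift needs far fewer:
n_small_lift = sample_size_per_variant(0.90, 0.92)
n_large_lift = sample_size_per_variant(0.90, 0.95)
```

The takeaway: the smaller the effect you care about, the longer the test must run before the comparison means anything.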