A/B Testing Routers
A/B testing routers enable you to split traffic between multiple model variants, compare their performance, and make data-driven decisions about which model to deploy to production.
Why Use A/B Testing?
Machine learning models need continuous validation in production. A/B testing helps you:
- Validate Model Changes: Before fully deploying a new model, test it against your current production model with real traffic
- Measure Impact: Quantify the difference in accuracy, latency, and business metrics between model versions
- Reduce Risk: Gradually roll out new models rather than replacing everything at once
- Optimize Continuously: Use multi-armed bandit algorithms to automatically route more traffic to better-performing models
Routing Strategies
Weighted Random
Split traffic randomly based on configured percentages. For example, route 80% of requests to the control model and 20% to a new variant.
Best for: Standard A/B tests where you want precise traffic allocation
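Weighted random routing can be sketched in a few lines. This is an illustrative example, not the platform's internal implementation; the variant IDs and weights mirror the 80/20 split above:

```python
import random

# Illustrative variant configuration: IDs mapped to traffic weights.
VARIANTS = {"control": 80, "treatment_a": 20}

def route(variants: dict) -> str:
    """Pick a variant at random, proportional to its weight."""
    ids = list(variants)
    weights = list(variants.values())
    return random.choices(ids, weights=weights, k=1)[0]

# Over many requests, roughly 80% land on "control" and 20% on "treatment_a".
counts = {v: 0 for v in VARIANTS}
for _ in range(10_000):
    counts[route(VARIANTS)] += 1
```

Because each request is routed independently, the realized split fluctuates around the configured percentages but converges as volume grows.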
Feature-Based
Route requests based on input feature values. For example, route premium customers to a specialized model while standard customers use the default model.
Best for: Segment-specific testing, personalized model selection
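A feature-based rule is essentially a deterministic function from input features to a variant ID. A minimal sketch, using the premium-customer example above (the feature name `customer_tier` and variant IDs are hypothetical):

```python
def route_by_feature(features: dict) -> str:
    """Return a variant ID based on an input feature value."""
    # Premium customers go to the specialized model; everyone else
    # uses the default model.
    if features.get("customer_tier") == "premium":
        return "premium_model"
    return "default_model"
```

Because the rule is deterministic, the same entity always reaches the same variant, which keeps segment-level comparisons clean.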
Multi-Armed Bandit
Automatically balance exploration (testing new variants) and exploitation (routing to the best performer). The algorithm learns which variants perform best and adjusts traffic allocation over time.
Algorithms available:
- Epsilon-Greedy: Routes most traffic to the best variant, with a small percentage exploring others
- Thompson Sampling: Uses Bayesian probability to balance exploration and exploitation
- UCB1: Upper Confidence Bound algorithm that favors variants with either a high average reward or high uncertainty (few observations)
Best for: Continuous optimization, when you want to minimize exposure to poor-performing variants
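To make the exploration/exploitation trade-off concrete, here is a minimal epsilon-greedy sketch. It is illustrative only (the reward-tracking structure is not the platform's internal implementation): with probability epsilon the router explores a random variant, otherwise it exploits the variant with the best observed mean reward.

```python
import random

class EpsilonGreedy:
    """Minimal epsilon-greedy bandit over a set of variant IDs."""

    def __init__(self, variants, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = {v: 0 for v in variants}    # pulls per variant
        self.rewards = {v: 0.0 for v in variants}  # cumulative reward

    def select(self) -> str:
        # With probability epsilon, explore a random variant...
        if random.random() < self.epsilon:
            return random.choice(list(self.counts))
        # ...otherwise exploit the variant with the best mean reward so far.
        return max(self.counts, key=lambda v: self.rewards[v] / max(self.counts[v], 1))

    def update(self, variant: str, reward: float) -> None:
        """Record an observed reward (e.g. 1.0 for a correct prediction)."""
        self.counts[variant] += 1
        self.rewards[variant] += reward
```

Thompson Sampling and UCB1 differ only in the `select` step: Thompson Sampling samples from a posterior over each variant's reward, while UCB1 adds an uncertainty bonus to each mean before taking the maximum.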
Creating a Router
- Navigate to MLOps → A/B Testing Routers
- Click Create Router
- Enter a name and description
- Select a routing strategy
- Add model variants:
  - Choose at least 2 models from your Model Registry
  - Assign variant IDs (e.g., "control", "treatment_a")
  - Set traffic weights (for weighted random)
  - Mark one variant as the control
- Configure strategy-specific settings if needed
- Click Create Router
Managing Variants
Adjusting Traffic Weights
For weighted random routers, you can adjust traffic allocation in real-time:
- Open the router details page
- Go to the Variants tab
- Adjust the weight slider or enter a specific percentage
- Changes take effect immediately
Enabling/Disabling Variants
You can temporarily disable a variant without deleting it:
- Toggle the variant's status in the Variants tab
- Traffic will be redistributed among remaining enabled variants
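One natural way to redistribute traffic is to renormalize the remaining weights so they still sum to 100%. This sketch assumes proportional redistribution, which the platform may or may not use internally:

```python
def effective_weights(variants: dict, enabled: set) -> dict:
    """Renormalize weights over enabled variants so they sum to 100."""
    total = sum(w for v, w in variants.items() if v in enabled)
    return {v: 100 * w / total for v, w in variants.items() if v in enabled}

# Disabling "treatment_b" shifts its 20% proportionally onto the others.
weights = effective_weights(
    {"control": 60, "treatment_a": 20, "treatment_b": 20},
    enabled={"control", "treatment_a"},
)
```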
Viewing Results
Traffic Distribution
The Traffic tab shows real-time visualization of:
- Request distribution across variants
- Traffic over time
- Success/error rates per variant
Prediction Logs
Every prediction made through the router is logged with:
- Which variant handled the request
- Routing decision reason
- Input features (summary)
- Prediction result
- Response latency
- Success/failure status
Comparing Variants
Use the metrics view to compare variants side-by-side:
- Success Rate: Percentage of successful predictions
- Average Latency: Response time in milliseconds
- Request Volume: Total requests processed
Deploying and Managing
Starting a Router
- Click Deploy to launch the router container on Kubernetes
- The router creates an API endpoint for inference
- Route your application's prediction requests through this endpoint
Stopping a Router
Click Stop to pause the router while preserving all configuration and historical data.
Using the Router Endpoint
Once deployed, your router exposes a prediction endpoint:
```shell
curl -X POST https://your-router-endpoint/predict \
  -H "Content-Type: application/json" \
  -d '{"features": {"age": 35, "income": 75000}}'
```
The response includes the prediction result plus routing metadata:
```json
{
  "prediction": "approved",
  "variant_id": "treatment_a",
  "model_id": "model-123",
  "latency_ms": 45
}
```
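The same call can be made from application code. A minimal sketch using only the standard library; the endpoint URL is the placeholder from the example above, so substitute your router's actual URL (the call itself is commented out since it requires a live endpoint):

```python
import json
import urllib.request

# Placeholder endpoint from the example above -- substitute your router's URL.
ROUTER_URL = "https://your-router-endpoint/predict"

def predict(features: dict) -> dict:
    """POST a feature payload to the router and return the parsed response."""
    req = urllib.request.Request(
        ROUTER_URL,
        data=json.dumps({"features": features}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# result = predict({"age": 35, "income": 75000})
# result["variant_id"] tells you which model served the request.
```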
Providing Feedback (for Bandits)
For multi-armed bandit routing to optimize traffic allocation, you need to provide ground truth data. This uses the same mechanism as drift detection: upload actual outcomes, which are matched to predictions via entityId.
When ground truth is uploaded, the system automatically calculates rewards based on prediction accuracy and updates the bandit algorithm's understanding of which variant performs best.
See Drift Detection → Ground Truth Integration for details on uploading ground truth data.
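Conceptually, the reward calculation joins logged predictions to uploaded outcomes by entityId and scores each prediction. This is a hypothetical sketch of that join (the field names mirror the prediction log, but the actual reward logic is internal to the platform):

```python
# Example logged predictions and uploaded ground truth, keyed by entityId.
predictions = [
    {"entityId": "e1", "variant_id": "control", "prediction": "approved"},
    {"entityId": "e2", "variant_id": "treatment_a", "prediction": "denied"},
]
ground_truth = {"e1": "approved", "e2": "approved"}

def rewards_by_variant(preds, truth):
    """Mean binary accuracy reward per variant, matched via entityId."""
    collected = {}
    for p in preds:
        actual = truth.get(p["entityId"])
        if actual is None:
            continue  # no outcome uploaded yet for this entity
        reward = 1.0 if p["prediction"] == actual else 0.0
        collected.setdefault(p["variant_id"], []).append(reward)
    return {v: sum(r) / len(r) for v, r in collected.items()}
```

Predictions without a matching outcome simply contribute no reward until their ground truth arrives.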
Best Practices
- Run tests long enough: Ensure you have statistically significant sample sizes before drawing conclusions
- Define success metrics upfront: Know what you're measuring before starting the test
- Start with small traffic percentages: Begin with 5-10% traffic to the new variant before increasing
- Monitor continuously: Watch for unexpected behavior like increased error rates or latency spikes
- Document experiments: Keep notes on why you're running each test and what you learned
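To judge whether a test has run long enough, you can compare variant success rates with a standard two-proportion z-test. A stdlib-only sketch (the sample counts are hypothetical):

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference in two success rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 52.0% vs 48.0% success over 1000 requests each: not yet significant.
z, p = two_proportion_z(520, 1000, 480, 1000)
```

If the p-value is above your significance threshold (commonly 0.05), keep the test running rather than declaring a winner.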