Drift Detection
Drift detection monitors changes in your model's input data over time and alerts you when the production data differs significantly from the training data.
Why Drift Detection Matters
Machine learning models are trained on historical data, but the real world is constantly changing. Over time, the data your model sees in production may differ from the data it was trained on, causing performance degradation. This is known as drift.
Types of Drift
- Data Drift: The statistical distribution of input features changes over time
  - Example: Customer demographics shift, seasonal patterns change
- Concept Drift: The relationship between inputs and outputs changes
  - Example: Customer behavior evolves, economic conditions shift
- Prediction Drift: Model accuracy degrades compared to baseline performance
  - Requires ground truth labels to detect
How It Works
Reference Baselines
A reference baseline captures the statistical distributions of your training data features. When you run drift analysis, we compare current production data against this baseline to detect changes.
Creating a Baseline:
- Export feature distributions from your training dataset
- Upload to Strongly via the API or SDK
- Set the baseline as active for your model
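The exact baseline payload depends on your setup; as a rough sketch of step 1, the following computes per-feature distributions from a training DataFrame. The shape of the dictionary and the build_baseline helper are assumptions for illustration, not the documented Strongly format; the upload itself (steps 2 and 3) goes through the API or SDK as described above.

```python
# Minimal sketch: summarize each training feature as a distribution.
# The payload shape is an illustrative assumption, not the documented format.
import numpy as np
import pandas as pd

def build_baseline(train_df: pd.DataFrame, bins: int = 10) -> dict:
    """Summarize numerical features as histograms and categoricals as frequencies."""
    baseline = {}
    for col in train_df.columns:
        series = train_df[col].dropna()
        if pd.api.types.is_numeric_dtype(series):
            counts, edges = np.histogram(series, bins=bins)
            baseline[col] = {
                "type": "numerical",
                "bin_edges": edges.tolist(),
                "proportions": (counts / counts.sum()).tolist(),
            }
        else:
            freqs = series.value_counts(normalize=True)
            baseline[col] = {
                "type": "categorical",
                "categories": freqs.index.tolist(),
                "proportions": freqs.tolist(),
            }
    return baseline
```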
Statistical Tests
We use multiple statistical methods to detect drift:
| Metric | Use Case | Interpretation |
|---|---|---|
| PSI (Population Stability Index) | Overall distribution shift | < 0.1: OK, 0.1-0.2: Warning, ≥ 0.2: Alert |
| Kolmogorov-Smirnov Test | Numerical features | p-value < 0.05 indicates significant drift |
| Chi-Square Test | Categorical features | p-value < 0.05 indicates significant drift |
| Jensen-Shannon Divergence | General distribution comparison | 0-1 scale, higher = more divergence |
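For intuition about how the per-feature tests behave, here is a local sketch using scipy. It mirrors the interpretations in the table, but it is not the implementation Strongly runs server-side.

```python
# Local illustrations of the tests in the table above, using scipy.
import numpy as np
from scipy import stats
from scipy.spatial import distance

def ks_drift(reference: np.ndarray, current: np.ndarray, alpha: float = 0.05) -> bool:
    """Two-sample Kolmogorov-Smirnov test for a numerical feature."""
    result = stats.ks_2samp(reference, current)
    return result.pvalue < alpha  # True => significant drift

def chi_square_drift(ref_counts: np.ndarray, cur_counts: np.ndarray, alpha: float = 0.05) -> bool:
    """Chi-square test for a categorical feature, given counts per category."""
    expected = ref_counts / ref_counts.sum() * cur_counts.sum()  # rescale to the same total
    result = stats.chisquare(cur_counts, f_exp=expected)
    return result.pvalue < alpha

def js_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """Jensen-Shannon divergence between two probability vectors (0 = identical)."""
    # scipy returns the JS distance (square root of the divergence); base 2 keeps the 0-1 scale.
    return distance.jensenshannon(p, q, base=2) ** 2
```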
PSI Thresholds
PSI (Population Stability Index) is the primary metric for drift detection:
- PSI < 0.1: No significant drift. Data distribution is stable.
- 0.1 ≤ PSI < 0.2: Moderate drift. Investigate potential causes.
- PSI ≥ 0.2: Significant drift. Action required. Consider retraining.
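As a concrete reference for these thresholds, a straightforward PSI computation over matching bins looks like the sketch below; it assumes you already have reference and current proportions over the same bins.

```python
import numpy as np

def psi(reference_props: np.ndarray, current_props: np.ndarray, eps: float = 1e-4) -> float:
    """Population Stability Index between two binned distributions (same bins)."""
    ref = np.clip(reference_props, eps, None)  # avoid log(0) on empty bins
    cur = np.clip(current_props, eps, None)
    return float(np.sum((cur - ref) * np.log(cur / ref)))

def psi_status(value: float) -> str:
    """Map a PSI value onto the thresholds documented above."""
    if value < 0.1:
        return "OK"
    if value < 0.2:
        return "Warning"
    return "Alert"

reference = np.array([0.2, 0.2, 0.2, 0.2, 0.2])
current = np.array([0.30, 0.25, 0.20, 0.15, 0.10])
print(psi_status(psi(reference, current)))  # prints "Warning" for this mild shift
```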
Using Drift Detection
Viewing Drift Status
- Navigate to MLOps → Drift Detection
- See all models with their current drift status
- Click on a model to view detailed drift analysis
Running Analysis
- Open a model's drift details page
- Click Run Analysis
- Analysis compares the last 7 days of predictions against your baseline
- View results showing:
  - Overall drift score
  - Per-feature PSI values
  - Features with significant drift highlighted
Understanding Results
The drift dashboard shows:
- Overall Status: Green (OK), Yellow (Warning), or Red (Alert)
- Drift Score: Aggregate PSI across all features
- Feature Breakdown: Individual feature drift metrics
- Trend History: How drift has changed over time
Ground Truth Integration
Ground truth (actual outcomes) allows you to measure prediction drift and model accuracy over time.
Uploading Ground Truth
- Open the model's drift details page
- Click Upload Ground Truth
- Provide data in JSON format:
```json
[
  { "entityId": "customer_123", "actualOutcome": "approved" },
  { "entityId": "customer_456", "actualOutcome": "denied" }
]
```
The entityId should match the entity ID used when making predictions. This allows us to match ground truth with the corresponding predictions.
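For example, matching records locally by entityId and computing accuracy might look like this; the record shapes are assumptions for the example, not an exported Strongly format.

```python
# Illustration of the entityId matching described above.
predictions = [
    {"entityId": "customer_123", "prediction": "approved"},
    {"entityId": "customer_456", "prediction": "approved"},
]
ground_truth = [
    {"entityId": "customer_123", "actualOutcome": "approved"},
    {"entityId": "customer_456", "actualOutcome": "denied"},
]

actuals = {row["entityId"]: row["actualOutcome"] for row in ground_truth}
matched = [(p["prediction"], actuals[p["entityId"]])
           for p in predictions if p["entityId"] in actuals]
accuracy = sum(pred == actual for pred, actual in matched) / len(matched)
print(f"Matched {len(matched)} predictions, accuracy = {accuracy:.0%}")  # 50%
```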
Benefits of Ground Truth
With ground truth data, you can:
- Track actual accuracy over time
- Detect concept drift (when model accuracy drops)
- Compare predicted vs actual distributions
- Get alerts when performance falls below thresholds
Setting Up Alerts
Configure automated notifications when drift is detected:
- Set PSI thresholds for warning and alert levels
- Enable email or Slack notifications
- Choose analysis frequency (hourly, daily, weekly)
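Concretely, an alert configuration covers the three settings above. The field names in this sketch are hypothetical and shown only for illustration, not the documented schema.

```python
# Hypothetical field names, shown only to illustrate the options listed above.
drift_alert_config = {
    "psi_warning_threshold": 0.1,      # Yellow / Warning level
    "psi_alert_threshold": 0.2,        # Red / Alert level
    "notifications": ["email", "slack"],
    "analysis_frequency": "daily",     # one of: hourly, daily, weekly
}
```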
Best Practices
1. Establish Baselines Early
Create reference baselines immediately after training. The training data distribution is your reference point for comparison.
2. Monitor Continuously
Don't wait for problems to appear. Schedule regular drift analysis (daily or weekly) to catch issues early.
3. Investigate Warnings
When drift is detected, investigate before it becomes critical:
- Is this expected (seasonal change, business event)?
- Are specific features driving the drift?
- Is model performance actually affected?
4. Collect Ground Truth
Where possible, collect actual outcomes to measure real model performance, not just data distribution changes.
5. Plan for Retraining
When significant drift is confirmed:
- Collect recent data with ground truth labels
- Retrain the model on updated data
- Create a new baseline
- Use A/B testing routers to compare the retrained model against the current version
Interpreting Common Scenarios
Seasonal Drift
Pattern: Drift appears at regular intervals (monthly, quarterly).
Action: Consider building season-aware models or using different models for different periods.
Sudden Drift Spike
Pattern: PSI jumps dramatically in a short time.
Action: Investigate recent changes: new data sources, pipeline bugs, or external events.
Gradual Increase
Pattern: Drift slowly increases over weeks/months.
Action: Schedule model retraining as part of regular maintenance.
Single Feature Drift
Pattern: One feature shows high drift while others are stable.
Action: Investigate that specific feature. Consider feature engineering, or remove the feature if it is unreliable.
Technical Notes
Prediction Logging
All predictions are automatically logged when made through the Strongly AI Gateway. This includes:
- Input features
- Model prediction
- Timestamp
- Latency
- Entity ID (if provided)
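Put together, a single logged prediction carries the fields above. The names in this sketch are illustrative, not a guaranteed export schema.

```python
# Illustrative shape of one logged prediction; field names are not guaranteed
# to match the export schema.
logged_prediction = {
    "entityId": "customer_123",                # optional, needed for ground truth matching
    "features": {"age": 42, "income": 55000},  # input features
    "prediction": "approved",                  # model output
    "timestamp": "2024-06-01T12:34:56Z",
    "latency_ms": 18,
}
```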
Data Retention
Prediction logs are retained for 90 days by default. Historical drift analysis results are stored indefinitely.
Performance
Drift analysis is performed asynchronously and typically completes within seconds for datasets up to 100,000 predictions.