Drift Detection

Drift detection monitors changes in your model's input data over time and alerts you when the production data differs significantly from the training data.

Why Drift Detection Matters

Machine learning models are trained on historical data, but the real world is constantly changing. Over time, the data your model sees in production may differ from the data it was trained on, causing performance degradation. This is known as drift.

Types of Drift

  1. Data Drift: The statistical distribution of input features changes over time

    • Example: Customer demographics shift, seasonal patterns change
  2. Concept Drift: The relationship between inputs and outputs changes

    • Example: Customer behavior evolves, economic conditions shift
  3. Prediction Drift: Model accuracy degrades compared to baseline performance

    • Requires ground truth labels to detect

How It Works

Reference Baselines

A reference baseline captures the statistical distributions of your training data features. When you run drift analysis, we compare current production data against this baseline to detect changes.

Creating a Baseline:

  1. Export feature distributions from your training dataset
  2. Upload to Strongly via the API or SDK (steps 1 and 2 are sketched below)
  3. Set the baseline as active for your model
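
The following is a minimal sketch of steps 1 and 2, assuming the training data is available as a pandas DataFrame. The per-feature payload shape and the upload endpoint are illustrative assumptions, not the documented Strongly API; consult the API/SDK reference for the exact format.

# Sketch: export per-feature reference distributions from training data.
# The payload shape and endpoint below are illustrative assumptions,
# not the documented Strongly API.
import os
import numpy as np
import pandas as pd
import requests

def build_baseline(train_df: pd.DataFrame, bins: int = 10) -> dict:
    baseline = {}
    for col in train_df.columns:
        series = train_df[col].dropna()
        if pd.api.types.is_numeric_dtype(series):
            counts, edges = np.histogram(series, bins=bins)
            baseline[col] = {
                "type": "numerical",
                "bin_edges": edges.tolist(),
                "counts": counts.tolist(),
            }
        else:
            baseline[col] = {
                "type": "categorical",
                "counts": series.value_counts().to_dict(),
            }
    return baseline

train_df = pd.read_csv("training_data.csv")
payload = build_baseline(train_df)

# Hypothetical upload call -- replace with the real endpoint from the API docs.
requests.post(
    "https://api.strongly.example/models/MODEL_ID/baselines",
    headers={"Authorization": f"Bearer {os.environ['STRONGLY_API_KEY']}"},
    json={"name": "training-v1", "features": payload},
)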

Statistical Tests

We use multiple statistical methods to detect drift:

Metric | Use Case | Interpretation
PSI (Population Stability Index) | Overall distribution shift | < 0.1: OK, 0.1-0.2: Warning, > 0.2: Alert
Kolmogorov-Smirnov Test | Numerical features | p-value < 0.05 indicates significant drift
Chi-Square Test | Categorical features | p-value < 0.05 indicates significant drift
Jensen-Shannon Divergence | General distribution comparison | 0-1 scale; higher means more divergence
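
To make the table concrete, here is a small standalone illustration of how the Kolmogorov-Smirnov and chi-square tests flag drift on individual features, using SciPy. This shows the underlying statistics only; it is not the platform's internal implementation, and the sample data is synthetic.

# Sketch: per-feature drift tests using SciPy (synthetic data for illustration).
import numpy as np
from scipy import stats

# Numerical feature: two-sample Kolmogorov-Smirnov test.
reference = np.random.normal(loc=0.0, scale=1.0, size=5000)
production = np.random.normal(loc=0.3, scale=1.0, size=5000)  # shifted mean
ks_stat, ks_p = stats.ks_2samp(reference, production)
print(f"KS p-value: {ks_p:.4f}  drift detected: {ks_p < 0.05}")

# Categorical feature: chi-square test on observed category counts.
# Rows = datasets (reference vs. production), columns = categories.
counts = np.array([
    [400, 350, 250],   # reference counts per category
    [300, 300, 400],   # production counts per category
])
chi2, chi_p, dof, expected = stats.chi2_contingency(counts)
print(f"Chi-square p-value: {chi_p:.4f}  drift detected: {chi_p < 0.05}")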

PSI Thresholds

PSI (Population Stability Index) is the primary metric for drift detection (a computation sketch follows the thresholds below):

  • PSI < 0.1: No significant drift. Data distribution is stable.
  • 0.1 ≤ PSI < 0.2: Moderate drift. Investigate potential causes.
  • PSI ≥ 0.2: Significant drift. Action required. Consider retraining.
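
The PSI formula itself is standard: bin the reference and production samples, then sum (actual% − expected%) × ln(actual% / expected%) over the bins. Below is a minimal sketch, assuming NumPy arrays of raw samples; it follows the standard formula and the thresholds above, not necessarily the platform's exact binning strategy.

# Sketch: Population Stability Index with the standard thresholds above.
# Standard formula; the platform's exact binning strategy may differ.
import numpy as np

def psi(reference: np.ndarray, production: np.ndarray, bins: int = 10) -> float:
    # Bin edges come from the reference (expected) distribution;
    # production values outside that range are ignored in this sketch.
    edges = np.histogram_bin_edges(reference, bins=bins)
    expected, _ = np.histogram(reference, bins=edges)
    actual, _ = np.histogram(production, bins=edges)

    # Convert to proportions; a small epsilon avoids division by zero / log(0).
    eps = 1e-6
    expected_pct = np.clip(expected / expected.sum(), eps, None)
    actual_pct = np.clip(actual / actual.sum(), eps, None)

    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

value = psi(np.random.normal(0, 1, 10_000), np.random.normal(0.2, 1, 10_000))
if value >= 0.2:
    status = "Alert"
elif value >= 0.1:
    status = "Warning"
else:
    status = "OK"
print(f"PSI = {value:.3f} ({status})")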

Using Drift Detection

Viewing Drift Status

  1. Navigate to MLOps → Drift Detection
  2. See all models with their current drift status
  3. Click on a model to view detailed drift analysis

Running Analysis

  1. Open a model's drift details page
  2. Click Run Analysis
  3. Analysis compares the last 7 days of predictions against your baseline
  4. View results showing:
    • Overall drift score
    • Per-feature PSI values
    • Features with significant drift highlighted

Understanding Results

The drift dashboard shows:

  • Overall Status: Green (OK), Yellow (Warning), or Red (Alert)
  • Drift Score: Aggregate PSI across all features (see the sketch after this list)
  • Feature Breakdown: Individual feature drift metrics
  • Trend History: How drift has changed over time
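
As a rough illustration of how per-feature PSI values might roll up into an overall status, the sketch below uses the mean PSI across features together with the thresholds from the PSI Thresholds section. The feature names and values are made up, and the dashboard's exact aggregation is not specified here, so treat this as a conceptual model rather than the platform's formula.

# Sketch: rolling per-feature PSI up into an overall status.
# Mean PSI is one plausible aggregation; the dashboard's formula may differ.
feature_psi = {
    "age": 0.04,
    "income": 0.15,
    "account_tenure": 0.31,
}

drift_score = sum(feature_psi.values()) / len(feature_psi)
drifting = [name for name, value in feature_psi.items() if value >= 0.2]

if drift_score >= 0.2 or drifting:
    status = "Red (Alert)"
elif drift_score >= 0.1:
    status = "Yellow (Warning)"
else:
    status = "Green (OK)"

print(f"Drift score: {drift_score:.2f}, status: {status}, drifting features: {drifting}")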

Ground Truth Integration

Ground truth (actual outcomes) allows you to measure prediction drift and model accuracy over time.

Uploading Ground Truth

  1. Open the model's drift details page
  2. Click Upload Ground Truth
  3. Provide data in JSON format:
[
{ "entityId": "customer_123", "actualOutcome": "approved" },
{ "entityId": "customer_456", "actualOutcome": "denied" }
]

The entityId must match the entity ID supplied when the prediction was made; this is how each actual outcome is joined to its corresponding prediction.

Benefits of Ground Truth

With ground truth data, you can:

  • Track actual accuracy over time (sketched after this list)
  • Detect concept drift (when model accuracy drops)
  • Compare predicted vs actual distributions
  • Get alerts when performance falls below thresholds
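
A minimal sketch of what ground-truth matching makes possible: joining logged predictions to actual outcomes by entityId and tracking accuracy per week. The entityId and actualOutcome fields follow the upload format above; the predicted and timestamp fields and the record shapes are illustrative assumptions, not the platform's export format.

# Sketch: join predictions to ground truth by entityId and track weekly accuracy.
# Record shapes are illustrative, not the platform's export format.
import pandas as pd

predictions = pd.DataFrame([
    {"entityId": "customer_123", "predicted": "approved", "timestamp": "2024-05-06"},
    {"entityId": "customer_456", "predicted": "approved", "timestamp": "2024-05-07"},
])
ground_truth = pd.DataFrame([
    {"entityId": "customer_123", "actualOutcome": "approved"},
    {"entityId": "customer_456", "actualOutcome": "denied"},
])

joined = predictions.merge(ground_truth, on="entityId", how="inner")
joined["correct"] = joined["predicted"] == joined["actualOutcome"]
joined["timestamp"] = pd.to_datetime(joined["timestamp"])

# Weekly accuracy: a sustained drop suggests concept drift.
weekly_accuracy = joined.set_index("timestamp")["correct"].resample("W").mean()
print(weekly_accuracy)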

Setting Up Alerts

Configure automated notifications when drift is detected (a conceptual sketch follows these steps):

  1. Set PSI thresholds for warning and alert levels
  2. Enable email or Slack notifications
  3. Choose analysis frequency (hourly, daily, weekly)
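
The alert configuration itself lives in the UI, but the logic behind such a notification is roughly the following. The webhook URL, thresholds, and model name here are assumptions for illustration, not Strongly configuration; the Slack payload uses the standard incoming-webhook format.

# Sketch: threshold-based drift alerting, conceptually.
# The webhook URL, thresholds, and model name are illustrative assumptions.
import requests

WARNING_PSI = 0.1
ALERT_PSI = 0.2
SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX/YYY/ZZZ"

def notify_if_drifting(model_name: str, drift_score: float) -> None:
    if drift_score >= ALERT_PSI:
        level = "ALERT"
    elif drift_score >= WARNING_PSI:
        level = "WARNING"
    else:
        return  # below the warning threshold, nothing to send

    requests.post(SLACK_WEBHOOK, json={
        "text": f"[{level}] Drift score {drift_score:.2f} for model '{model_name}'",
    })

notify_if_drifting("credit-approval-v3", 0.27)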

Best Practices

1. Establish Baselines Early

Create reference baselines immediately after training. The training data distribution is the reference point against which production data is compared.

2. Monitor Continuously

Don't wait for problems to appear. Schedule regular drift analysis (daily or weekly) to catch issues early.

3. Investigate Warnings

When drift is detected, investigate before it becomes critical:

  • Is this expected (seasonal change, business event)?
  • Are specific features driving the drift?
  • Is model performance actually affected?

4. Collect Ground Truth

Where possible, collect actual outcomes to measure real model performance, not just data distribution changes.

5. Plan for Retraining

When significant drift is confirmed:

  • Collect recent data with ground truth labels
  • Retrain the model on updated data
  • Create a new baseline
  • Use A/B testing routers to compare the retrained model against the current version

Interpreting Common Scenarios

Seasonal Drift

Pattern: Drift appears at regular intervals (monthly, quarterly).
Action: Consider building season-aware models or using different models for different periods.

Sudden Drift Spike

Pattern: PSI jumps dramatically in a short time.
Action: Investigate recent changes: new data sources, pipeline bugs, or external events.

Gradual Increase

Pattern: Drift slowly increases over weeks/months.
Action: Schedule model retraining as part of regular maintenance.

Single Feature Drift

Pattern: One feature shows high drift while others are stable.
Action: Investigate that specific feature; consider feature engineering, or remove the feature if it proves unreliable.

Technical Notes

Prediction Logging

All predictions are automatically logged when made through the Strongly AI Gateway. This includes:

  • Input features
  • Model prediction
  • Timestamp
  • Latency
  • Entity ID (if provided)

Data Retention

Prediction logs are retained for 90 days by default. Historical drift analysis results are stored indefinitely.

Performance

Drift analysis is performed asynchronously and typically completes within seconds for datasets up to 100,000 predictions.