Root Cause Analysis

What it answers: Why did this metric change between two periods?

How it works

  1. Computes metric value for period A and period B
  2. For each provided dimension, computes per-segment values in both periods
  3. Calculates contribution: segment_delta / total_delta
  4. Ranks dimensions by maximum segment contribution
  5. Identifies the primary driver — the segment explaining the most change
  6. Returns structured explanation with confidence score
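The first four steps can be sketched in plain Python. This is only an illustration of the arithmetic described above, not the library's implementation: the helper names (`segment_totals`, `rank_dimensions`), the row format (a list of dicts), and the sample figures are all assumptions made for the example.

```python
from collections import defaultdict


def segment_totals(rows, dimension, measure):
    """Sum `measure` for each distinct value of `dimension`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


def rank_dimensions(previous, current, dimensions, measure):
    """Hypothetical helper: rank dimensions by their largest segment contribution."""
    total_delta = sum(r[measure] for r in current) - sum(r[measure] for r in previous)
    if total_delta == 0:
        return []  # nothing to explain; also avoids dividing by zero

    rankings = []
    for dim in dimensions:
        before = segment_totals(previous, dim, measure)
        after = segment_totals(current, dim, measure)
        # contribution = segment_delta / total_delta, for every segment seen in either period
        contributions = {
            seg: (after.get(seg, 0.0) - before.get(seg, 0.0)) / total_delta
            for seg in set(before) | set(after)
        }
        top_segment, top_value = max(contributions.items(), key=lambda kv: kv[1])
        rankings.append(
            {"dimension": dim, "max_contribution": top_value, "top_segment": top_segment}
        )

    rankings.sort(key=lambda r: r["max_contribution"], reverse=True)
    return rankings


# Made-up sample rows, one per (country, channel) pair in each period.
previous_rows = [
    {"country": "UK", "channel": "web", "revenue": 60},
    {"country": "UK", "channel": "store", "revenue": 40},
    {"country": "FR", "channel": "web", "revenue": 30},
    {"country": "FR", "channel": "store", "revenue": 20},
]
current_rows = [
    {"country": "UK", "channel": "web", "revenue": 110},
    {"country": "UK", "channel": "store", "revenue": 75},
    {"country": "FR", "channel": "web", "revenue": 40},
    {"country": "FR", "channel": "store", "revenue": 25},
]

rankings = rank_dimensions(previous_rows, current_rows, ["country", "channel"], "revenue")
primary = rankings[0]  # the dimension whose top segment explains the most change
```

Because each contribution is divided by the signed total change, a segment that moves in the same direction as the metric gets a positive share and one that moves against it gets a negative share, so taking the maximum picks the segment that explains the most of the change.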

Example

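```python
# `om` is assumed to be an already-initialized client; setup is not shown here.
result = om.analysis.root_cause(
    "quarterly_revenue_by_country",                               # metric name
    compare={"current": "2011-07-01", "previous": "2010-07-01"},  # the two periods
    dimensions=["country"],                                       # dimensions to investigate
    target="revenue_gbp",                                         # measure column
)
```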

Response:

```python
{
    "value": {
        "comparison": {"current": "2011-07-01", "previous": "2010-07-01"},
        "total_change": 170779,
        "direction": "increased",
        "primary_dimension": "country",
        "primary_driver": {"dimension": {"country": "United Kingdom"}, "reason": "largest positive contribution"},
        "dimension_rankings": [
            {"dimension": "country", "max_contribution": 0.85, "top_segments": [...]}
        ]
    },
    "explanation": "quarterly_revenue_by_country increased from 2010-07-01 to 2011-07-01. Primary driver: country=United Kingdom.",
    "confidence": 0.85,
    "suggested_actions": ["Understand what drove country=United Kingdom growth — replicate elsewhere"],
    "insights": ["United Kingdom accounts for 85% of the total change"]
}
```
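To show how the fields fit together, here is a small sketch that condenses a response of this shape into a one-line summary. It assumes the result behaves like the plain dict shown above; `summarize` is a hypothetical helper, and the `response` literal is an abbreviated copy of the example.

```python
# Abbreviated copy of the example response above.
response = {
    "value": {
        "total_change": 170779,
        "direction": "increased",
        "primary_driver": {"dimension": {"country": "United Kingdom"}},
    },
    "confidence": 0.85,
}


def summarize(response):
    """Hypothetical helper: one-line summary of a root_cause-style response dict."""
    value = response["value"]
    driver = value["primary_driver"]["dimension"]
    # A driver is a mapping of dimension name to segment value.
    driver_label = ", ".join(f"{name}={segment}" for name, segment in driver.items())
    direction = value["direction"]
    change = value["total_change"]
    confidence = response["confidence"]
    return f"{direction} by {change}; primary driver {driver_label} (confidence {confidence:.2f})"


summary = summarize(response)
# summary == "increased by 170779; primary driver country=United Kingdom (confidence 0.85)"
```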

When to use it

Use root_cause when you have an unexpected metric movement and need to understand the cause before deciding how to respond. It's most valuable when you have multiple possible explanations and want to rank them by explanatory power rather than intuition.

Interpreting results

| Contribution | Meaning |
| --- | --- |
| > 0.5 | Strong single cause: high-confidence action |
| 0.2 - 0.5 | Meaningful but not dominant: investigate alongside others |
| < 0.2 | Distributed causes: no single fix, look at systemic issues |

| Confidence | Meaning |
| --- | --- |
| > 0.7 | Sufficient data, trust the finding |
| 0.5 - 0.7 | Directional: useful but verify |
| < 0.5 | Insufficient data in some segments |
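These thresholds are easy to apply in code when triaging many results. The sketch below is one way to do it; the function names are hypothetical, and treating the boundary values (exactly 0.5 and 0.2 for contribution, exactly 0.7 and 0.5 for confidence) as part of the middle band is an assumption, since the ranges above do not say which side they fall on.

```python
def interpret_contribution(contribution):
    """Map a max_contribution value to the reading given in the first table."""
    if contribution > 0.5:
        return "strong single cause"
    if contribution >= 0.2:  # assumption: 0.2 and 0.5 belong to the middle band
        return "meaningful but not dominant"
    return "distributed causes"


def interpret_confidence(confidence):
    """Map a confidence score to the reading given in the second table."""
    if confidence > 0.7:
        return "sufficient data"
    if confidence >= 0.5:  # assumption: 0.5 and 0.7 belong to the middle band
        return "directional"
    return "insufficient data"


# Using the figures from the example response (both 0.85):
reading = (interpret_contribution(0.85), interpret_confidence(0.85))
```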

Limitations

  • Requires at least 2 periods of data for comparison
  • Does not establish causation: it attributes the change to segments but does not explain why those segments moved
  • Works best when the metric has 3+ declared dimensions
  • For metrics without time filters, falls back to client-side period splitting, which requires a time column in the metric output
  • Runtime grows with data size and with the number of dimensions, because one contribution analysis runs per dimension
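As a rough illustration of what client-side period splitting involves, the sketch below partitions already-fetched rows into the two periods using a time column. How the library actually matches rows to periods is not documented here; prefix matching on ISO-formatted date strings, the `split_periods` name, and the sample rows are all assumptions made for the example.

```python
def split_periods(rows, time_column, compare):
    """Hypothetical helper: split rows into (previous, current) by a time column.

    Assumes time values are ISO-formatted strings and that a period label such as
    "2025-02" matches every row whose time value starts with that label.
    """
    if any(time_column not in row for row in rows):
        # Mirrors the limitation above: without a time column there is nothing to split on.
        raise ValueError(f"every row needs a '{time_column}' column for period splitting")

    previous = [r for r in rows if str(r[time_column]).startswith(compare["previous"])]
    current = [r for r in rows if str(r[time_column]).startswith(compare["current"])]
    return previous, current


# Made-up sample rows.
rows = [
    {"date": "2025-01-15", "revenue": 120},
    {"date": "2025-02-03", "revenue": 90},
    {"date": "2025-02-20", "revenue": 75},
    {"date": "2025-03-01", "revenue": 60},  # outside both periods, so ignored
]

previous, current = split_periods(rows, "date", {"current": "2025-02", "previous": "2025-01"})
```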

Parameters

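| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| metric | string | yes | Metric name |
| compare | dict | yes | Periods to compare, e.g. `{"current": "2025-02", "previous": "2025-01"}` |
| dimensions | list | yes | Dimensions to investigate |
| target | string | no | Measure column (auto-detected if omitted) |
| limit | integer | no | Max rows per query |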

MIT Licensed (SDK) | Proprietary (Server)