Statistical Moments
Statistical moments describe the shape and characteristics of a probability distribution.
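Mean Difference
Feature Name: mean_diff
$$ \mathrm{mean\_diff} = \mu_{\text{post}} - \mu_{\text{pre}} $$
where μ is the mean of a segment (pre-break or post-break): $$ \mu = \frac{1}{n} \sum_{i=1}^{n} x_i $$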
Interpretation
- mean_diff > 0: Post-break segment has higher average values
- mean_diff < 0: Post-break segment has lower average values
- mean_diff ≈ 0: No significant level shift
Why Useful
Level shifts are the most common type of structural break. A sudden change in the mean indicates that the underlying data-generating process has fundamentally changed its baseline.
Used by: All 25 detectors
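As a minimal sketch of this feature, the snippet below computes the difference in means between two segments. The function name `mean_diff`, the argument names `pre` and `post`, and the post-minus-pre sign convention are assumptions inferred from the interpretation above; this is not necessarily how the detectors implement it.

```python
from statistics import fmean


def mean_diff(pre, post):
    """Difference in segment means (assumed convention: post minus pre)."""
    return fmean(post) - fmean(pre)


pre = [1.0, 2.0, 3.0]    # values before the candidate break
post = [4.0, 5.0, 6.0]   # values after the candidate break

print(mean_diff(pre, post))  # 3.0, an upward level shift
```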
Median Difference
Feature Name: median_diff
Interpretation
- More robust to outliers than the mean
- A large difference between mean_diff and median_diff suggests outlier influence
Used by: All 25 detectors
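The robustness claim can be illustrated with a small sketch. It assumes `median_diff` is the post-break median minus the pre-break median, by analogy with mean_diff; the function and variable names are hypothetical. A single outlier moves the mean difference substantially while leaving the median difference at zero.

```python
from statistics import fmean, median


def median_diff(pre, post):
    """Difference in segment medians (assumed convention: post minus pre)."""
    return median(post) - median(pre)


pre = [1.0, 2.0, 3.0, 4.0, 5.0]
post = [1.0, 2.0, 3.0, 4.0, 100.0]  # same as pre except for one outlier

print(median_diff(pre, post))      # 0.0, the median ignores the outlier
print(fmean(post) - fmean(pre))    # 19.0, the mean is pulled by the outlier
```

A gap of this size between the two differences is the kind of signal the second bullet describes.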
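Standard Deviation Ratio
Feature Name: std_ratio
$$ \mathrm{std\_ratio} = \frac{\sigma_{\text{post}}}{\sigma_{\text{pre}} + \varepsilon} $$
where σ is the standard deviation of a segment (pre-break or post-break) and ε = 10⁻⁸ prevents division by zero.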
Interpretation
- std_ratio ≈ 1: No change in volatility
- std_ratio > 1: Volatility increased (e.g., 1.5 = 50% increase)
- std_ratio < 1: Volatility decreased
Why Useful
Volatility changes are a key type of structural break, especially in financial data. A market might shift from a calm regime to a turbulent one.
Used by: All 25 detectors
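A minimal sketch of the ratio follows, assuming it is the post-break standard deviation divided by the pre-break standard deviation plus ε. The use of the population standard deviation (`pstdev`) and the names `std_ratio`, `pre`, and `post` are assumptions made for illustration.

```python
from statistics import pstdev

EPS = 1e-8  # the epsilon from the text, guards against a zero denominator


def std_ratio(pre, post, eps=EPS):
    """Post-break volatility relative to pre-break volatility (assumed form)."""
    return pstdev(post) / (pstdev(pre) + eps)


pre = [1.0, 2.0, 3.0, 4.0, 5.0]
post = [1.5 * x for x in pre]      # same shape, 50% wider spread

print(std_ratio(pre, post))        # approximately 1.5

# A constant pre-break segment has zero spread; eps keeps the result finite.
print(std_ratio([2.0, 2.0, 2.0], [1.0, 2.0, 3.0]))
```

The second call returns a very large but finite number, which is the practical effect of adding ε to the denominator.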
Skewness Difference
Feature Name: skew_diff
$$ \mathrm{skew\_diff} = \gamma_{1,\text{post}} - \gamma_{1,\text{pre}} $$
where Fisher's skewness is: $$ \gamma_1 = \frac{1}{n} \sum_{i=1}^{n} \left[\frac{x_i - \mu}{\sigma}\right]^3 $$
Interpretation
- γ₁ = 0: Symmetric distribution
- γ₁ > 0: Right-skewed (long right tail)
- γ₁ < 0: Left-skewed (long left tail)
Used by: All except xgb_core_7features
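The skewness formula above can be sketched directly with the standard library. This assumes the population standard deviation for σ, a post-minus-pre difference, and a return value of 0.0 for constant segments; all three are illustrative choices, and the function names are hypothetical.

```python
from statistics import fmean, pstdev


def skewness(xs):
    """Fisher's skewness: mean of cubed standardized deviations."""
    mu = fmean(xs)
    sigma = pstdev(xs)
    if sigma == 0:
        return 0.0  # assumption: treat a constant segment as symmetric
    return fmean(((x - mu) / sigma) ** 3 for x in xs)


def skew_diff(pre, post):
    """Change in skewness (assumed convention: post minus pre)."""
    return skewness(post) - skewness(pre)


pre = [1.0, 2.0, 3.0, 4.0, 5.0]    # symmetric around 3
post = [1.0, 1.0, 1.0, 2.0, 10.0]  # long right tail

print(skewness(pre))               # 0.0 for the symmetric segment
print(skew_diff(pre, post) > 0)    # True, the post segment is right-skewed
```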
Kurtosis Difference
Feature Name: kurtosis_diff
$$ \mathrm{kurtosis\_diff} = \gamma_{2,\text{post}} - \gamma_{2,\text{pre}} $$
where excess kurtosis is: $$ \gamma_2 = \frac{1}{n} \sum_{i=1}^{n} \left[\frac{x_i - \mu}{\sigma}\right]^4 - 3 $$
Interpretation
- γ₂ = 0: Normal-like tails (mesokurtic)
- γ₂ > 0: Heavy tails, more outliers (leptokurtic)
- γ₂ < 0: Light tails, fewer outliers (platykurtic)
Why Useful
Changes in kurtosis indicate shifts in the likelihood of extreme events. Financial crises often manifest as increases in kurtosis.
Used by: All except xgb_core_7features
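The excess kurtosis formula above translates to a few lines of Python. As a sketch, it assumes the population standard deviation for σ, a post-minus-pre difference, and a return value of 0.0 for constant segments; the function names are hypothetical.

```python
from statistics import fmean, pstdev


def excess_kurtosis(xs):
    """Mean of fourth-power standardized deviations, minus 3."""
    mu = fmean(xs)
    sigma = pstdev(xs)
    if sigma == 0:
        return 0.0  # assumption: no tail information in a constant segment
    return fmean(((x - mu) / sigma) ** 4 for x in xs) - 3.0


def kurtosis_diff(pre, post):
    """Change in excess kurtosis (assumed convention: post minus pre)."""
    return excess_kurtosis(post) - excess_kurtosis(pre)


pre = [1.0, 2.0, 3.0, 4.0, 5.0]      # evenly spread, light tails
post = [0.0] * 8 + [-10.0, 10.0]     # mostly flat with two extreme values

print(excess_kurtosis(pre))          # approximately -1.3 (platykurtic)
print(excess_kurtosis(post))         # approximately 2.0 (leptokurtic)
print(kurtosis_diff(pre, post))      # approximately 3.3
```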
IQR Difference
Feature Name: iqr_diff
Interpretation
- Robust measure of spread
- Ignores extreme values in tails
- For normal data: IQR ≈ 1.35 × σ
Used by: All except xgb_core_7features
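The points above can be checked with a short sketch. It assumes `iqr_diff` is the post-break IQR minus the pre-break IQR, and it uses the "inclusive" quartile method of `statistics.quantiles`; the quartile convention actually used by the detectors is not stated here, so treat both as assumptions. The function names are hypothetical.

```python
from statistics import NormalDist, quantiles


def iqr(xs):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    return q3 - q1


def iqr_diff(pre, post):
    """Change in IQR (assumed convention: post minus pre)."""
    return iqr(post) - iqr(pre)


pre = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
post = [2.0 * x for x in pre]        # spread doubled

print(iqr_diff(pre, post))           # 4.0

# Replacing the largest value with an extreme outlier leaves the IQR unchanged.
with_outlier = pre[:-1] + [1000.0]
print(iqr(pre), iqr(with_outlier))   # 4.0 4.0

# For a standard normal distribution, the IQR is about 1.35 standard deviations.
std_normal = NormalDist(mu=0.0, sigma=1.0)
normal_iqr = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)
print(round(normal_iqr, 3))          # 1.349
```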