Statistical Moments

Statistical moments describe the shape and characteristics of a probability distribution.

Mean Difference

Feature Name: mean_diff

\[ \text{mean\_diff} = \mu_{\text{post}} - \mu_{\text{pre}} \]

where μ is the arithmetic mean of a segment with n observations: $$ \mu = \frac{1}{n} \sum_{i=1}^{n} x_i $$

Interpretation

  • mean_diff > 0: Post-break segment has higher average values
  • mean_diff < 0: Post-break segment has lower average values
  • mean_diff ≈ 0: Little or no level shift
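As a minimal sketch of the formula above, the feature can be computed with Python's standard library. It assumes the series has already been split into a pre-break and a post-break segment at a candidate break point. The function signature and sample values are illustrative, not taken from the detectors' code.

```python
from statistics import fmean

def mean_diff(pre, post):
    """Mean of the post-break segment minus mean of the pre-break segment."""
    return fmean(post) - fmean(pre)

# Hypothetical segments around a candidate break point.
pre = [1.0, 2.0, 3.0]    # mean 2.0
post = [4.0, 5.0, 6.0]   # mean 5.0

shift = mean_diff(pre, post)  # 5.0 - 2.0 = 3.0, an upward level shift
```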

Why Useful

Level shifts are among the most common types of structural break. A sudden change in the mean indicates that the baseline of the underlying data-generating process has changed.

Used by: All 25 detectors


Median Difference

Feature Name: median_diff

\[ \text{median\_diff} = M_{\text{post}} - M_{\text{pre}} \]

where M is the median of the corresponding segment.

Interpretation

  • More robust to outliers than mean
  • Large difference between mean_diff and median_diff suggests outlier influence
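The outlier effect described above can be seen in a small example. This is a sketch under the assumption that both features are computed on the same pre/post split. The helper names mirror the feature names but are written here for illustration only.

```python
from statistics import fmean, median

def mean_diff(pre, post):
    return fmean(post) - fmean(pre)

def median_diff(pre, post):
    return median(post) - median(pre)

# Hypothetical data: post equals pre except for a single extreme value.
pre = [10.0, 11.0, 9.0, 10.0, 10.0]
post = [10.0, 11.0, 9.0, 10.0, 60.0]

md = mean_diff(pre, post)     # 20.0 - 10.0 = 10.0, pulled up by the outlier
med = median_diff(pre, post)  # 10.0 - 10.0 = 0.0, unaffected by the outlier
```

Here the gap between the two features comes entirely from one observation, which is the pattern the second bullet refers to.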

Used by: All 25 detectors


Standard Deviation Ratio

Feature Name: std_ratio

\[ \text{std\_ratio} = \frac{\sigma_{\text{post}}}{\sigma_{\text{pre}} + \epsilon} \]

where ε = 10⁻⁸ prevents division by zero.

Interpretation

  • std_ratio ≈ 1: No change in volatility
  • std_ratio > 1: Volatility increased (e.g., 1.5 = 50% increase)
  • std_ratio < 1: Volatility decreased

Why Useful

Volatility changes are a key type of structural break, especially in financial data. A market might shift from a calm regime to a turbulent one.

Used by: All 25 detectors


Skewness Difference

Feature Name: skew_diff

\[ \text{skew\_diff} = \gamma_1(\text{post}) - \gamma_1(\text{pre}) \]

where Fisher's skewness is: $$ \gamma_1 = \frac{1}{n} \sum_{i=1}^{n} \left[\frac{x_i - \mu}{\sigma}\right]^3 $$

Interpretation

  • γ₁ = 0: Symmetric distribution
  • γ₁ > 0: Right-skewed (long right tail)
  • γ₁ < 0: Left-skewed (long left tail)
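A direct transcription of the skewness formula, as a sketch: it assumes σ is the population standard deviation (dividing by n, matching the 1/n in the formula) and that the segment is not constant, since σ = 0 would divide by zero. The function names are illustrative.

```python
from statistics import fmean

def skewness(xs):
    """Mean of cubed standardized values (biased, 1/n form)."""
    n = len(xs)
    mu = fmean(xs)
    sigma = (sum((x - mu) ** 2 for x in xs) / n) ** 0.5  # population std
    return sum(((x - mu) / sigma) ** 3 for x in xs) / n

def skew_diff(pre, post):
    return skewness(post) - skewness(pre)

pre = [1.0, 2.0, 3.0, 4.0, 5.0]    # symmetric: skewness is 0
post = [1.0, 1.0, 1.0, 1.0, 10.0]  # long right tail: skewness is 1.5

diff = skew_diff(pre, post)        # 1.5 - 0 = 1.5
```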

Used by: All except xgb_core_7features


Kurtosis Difference

Feature Name: kurtosis_diff

\[ \text{kurtosis\_diff} = \gamma_2(\text{post}) - \gamma_2(\text{pre}) \]

where excess kurtosis is: $$ \gamma_2 = \frac{1}{n} \sum_{i=1}^{n} \left[\frac{x_i - \mu}{\sigma}\right]^4 - 3 $$

Interpretation

  • γ₂ = 0: Normal-like tails (mesokurtic)
  • γ₂ > 0: Heavy tails, more outliers (leptokurtic)
  • γ₂ < 0: Light tails, fewer outliers (platykurtic)
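The excess kurtosis formula can be sketched the same way. As with skewness, this assumes the population standard deviation and a non-constant segment. The helper names and sample data are hypothetical.

```python
from statistics import fmean

def excess_kurtosis(xs):
    """Mean of fourth-power standardized values, minus 3 (biased, 1/n form)."""
    n = len(xs)
    mu = fmean(xs)
    sigma = (sum((x - mu) ** 2 for x in xs) / n) ** 0.5  # population std
    return sum(((x - mu) / sigma) ** 4 for x in xs) / n - 3.0

def kurtosis_diff(pre, post):
    return excess_kurtosis(post) - excess_kurtosis(pre)

pre = [1.0, 2.0, 3.0, 4.0, 5.0]            # evenly spread: about -1.3
post = [-10.0] + [0.0] * 8 + [10.0]        # two extreme values: about 2.0

diff = kurtosis_diff(pre, post)            # about 3.3, heavier tails after the break
```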

Why Useful

Changes in kurtosis indicate shifts in the likelihood of extreme events. Financial crises often manifest as increases in kurtosis.

Used by: All except xgb_core_7features


IQR Difference

Feature Name: iqr_diff

\[ \text{iqr\_diff} = \text{IQR}_{\text{post}} - \text{IQR}_{\text{pre}} \]

where: $$ \text{IQR} = Q_3 - Q_1 = P_{75} - P_{25} $$

Interpretation

  • Robust measure of spread
  • Ignores extreme values in tails
  • For normal data: IQR ≈ 1.35 × σ
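A short sketch of the IQR computation and the normal-data rule of thumb. It assumes that iqr_diff is the post-break IQR minus the pre-break IQR, as the feature name suggests, and that quartiles use linear interpolation (`method="inclusive"` in the standard library). The source does not state the percentile method actually used.

```python
from statistics import NormalDist, quantiles

def iqr(xs):
    """Interquartile range Q3 - Q1 with linearly interpolated quartiles."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    return q3 - q1

def iqr_diff(pre, post):
    # Assumption: difference is post minus pre, like the other *_diff features.
    return iqr(post) - iqr(pre)

pre = [1, 2, 3, 4, 5, 6, 7, 8, 9]          # Q1 = 3, Q3 = 7, IQR = 4
post = [2, 4, 6, 8, 10, 12, 14, 16, 18]    # Q1 = 6, Q3 = 14, IQR = 8

diff = iqr_diff(pre, post)                 # 8 - 4 = 4

# Rule of thumb check on the theoretical normal distribution with sigma = 2:
# IQR / sigma is about 1.349.
nd = NormalDist(mu=0.0, sigma=2.0)
normal_ratio = (nd.inv_cdf(0.75) - nd.inv_cdf(0.25)) / nd.stdev
```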

Used by: All except xgb_core_7features