Hypothesis Testing Pure

Ensemble of statistical tests with no training required.

Performance

Metric      Value    Rank
ROC AUC     0.5394   20th
F1 Score    0.4167   11th
Accuracy    0.4455   25th
Recall      0.6667   1st
Train Time  0s       Instant

Highest Recall

This model has the highest recall (0.67), catching more breaks than any other model — but at the cost of many false positives.

Architecture

Five statistical tests combined with weighted voting:

Time Series → [t-test, KS, CUSUM, LR, Bayes] → Weighted Score → Probability

Component Tests

1. Enhanced t-test (25% weight)

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \]
score = -log₁₀(p) × min(n₁, n₂) / max(n₁, n₂)
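
A minimal sketch of this component, assuming a SciPy-backed implementation of Welch's t-test; the pre/post split, function names, and the 1e-300 p-value floor are illustrative rather than taken from the repository:

import numpy as np
from scipy import stats

def t_test_score(pre, post):
    # Welch's t-test (unequal variances), as in the formula above
    n1, n2 = len(pre), len(post)
    _, p = stats.ttest_ind(pre, post, equal_var=False)
    p = max(p, 1e-300)                      # avoid log10(0)
    # Down-weight unbalanced splits via min(n1, n2) / max(n1, n2)
    return -np.log10(p) * min(n1, n2) / max(n1, n2)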

2. Kolmogorov-Smirnov Test (20% weight)

\[ D = \sup_x |F_1(x) - F_2(x)| \]
score = D × (-log₁₀(p))
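
A matching sketch with scipy.stats.ks_2samp; the p-value floor is again an illustrative guard:

import numpy as np
from scipy import stats

def ks_score(pre, post):
    # D is the largest CDF gap; weight it by the evidence -log10(p)
    D, p = stats.ks_2samp(pre, post)
    return D * -np.log10(max(p, 1e-300))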

3. CUSUM Test (15% weight)

\[ S_t = \sum_{i=1}^{t} \frac{x_i - \hat{\mu}}{\hat{\sigma}} \]
score = max|Sₜ| over the boundary region
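
A sketch of the standardized cumulative sum; the window parameter that defines the boundary region is an assumption for illustration:

import numpy as np

def cusum_score(x, boundary, window=10):
    # x: 1-D NumPy array; standardize against global mean/std estimates
    z = (x - x.mean()) / (x.std(ddof=1) + 1e-12)
    S = np.cumsum(z)                        # S_t from the formula above
    lo, hi = max(0, boundary - window), min(len(x), boundary + window)
    return np.abs(S[lo:hi]).max()           # max |S_t| near the boundary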

4. Likelihood Ratio Test (20% weight)

\[ \Lambda = -2 \log\left(\frac{L_{\text{single}}}{L_{\text{two-segment}}}\right) \]

Under the null hypothesis of no break, Λ ~ χ²(df).
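
A sketch under Gaussian segment models; the df = 2 choice (one extra mean plus one extra variance) and the conversion of the p-value into a score are assumptions:

import numpy as np
from scipy import stats

def gauss_loglik(x):
    # Maximized Gaussian log-likelihood with the plug-in MLE variance
    var = x.var() + 1e-12
    return -0.5 * len(x) * (np.log(2 * np.pi * var) + 1.0)

def lr_score(pre, post):
    x = np.concatenate([pre, post])
    # Lambda = -2 log(L_single / L_two-segment)
    lam = -2.0 * (gauss_loglik(x) - gauss_loglik(pre) - gauss_loglik(post))
    p = stats.chi2.sf(lam, df=2)            # null: Lambda ~ chi2(df)
    return -np.log10(max(p, 1e-300))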

5. Bayesian Model Comparison (20% weight)

\[ \log BF = -\frac{1}{2}(BIC_{\text{two}} - BIC_{\text{single}}) \]
\[ P(\text{break}|\text{data}) = \frac{1}{1 + \exp(-\log BF - \log(\text{prior odds}))} \]
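
A sketch using BIC approximations under the same Gaussian segment models; the parameter counts and the default prior odds of 1 are assumptions:

import numpy as np

def gauss_loglik(x):
    # Same helper as in the likelihood-ratio sketch above
    return -0.5 * len(x) * (np.log(2 * np.pi * (x.var() + 1e-12)) + 1.0)

def bayes_score(pre, post, prior_odds=1.0):
    x = np.concatenate([pre, post])
    n = len(x)
    # BIC = k log(n) - 2 log L; single model: one mean + one variance (k = 2)
    bic_single = 2 * np.log(n) - 2.0 * gauss_loglik(x)
    # Two-segment model: a mean and variance per segment (k = 4)
    bic_two = 4 * np.log(n) - 2.0 * (gauss_loglik(pre) + gauss_loglik(post))
    log_bf = -0.5 * (bic_two - bic_single)  # log Bayes factor for a break
    # Logistic map from the posterior formula above
    return 1.0 / (1.0 + np.exp(-log_bf - np.log(prior_odds)))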

Final Score Calculation

score = (0.25 × t_score +
         0.20 × ks_score +
         0.15 × cusum_score +
         0.20 × lr_score +
         0.20 × bayes_score)
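
Illustrative glue code combining the five sketches above. The component scores live on different scales (the Bayes score is already a probability), so a real implementation presumably rescales each component before weighting; this sketch omits that step:

WEIGHTS = {"t": 0.25, "ks": 0.20, "cusum": 0.15, "lr": 0.20, "bayes": 0.20}

def combined_score(x, boundary):
    # Split at the candidate break index and apply the weighted vote
    pre, post = x[:boundary], x[boundary:]
    components = {
        "t": t_test_score(pre, post),
        "ks": ks_score(pre, post),
        "cusum": cusum_score(x, boundary),
        "lr": lr_score(pre, post),
        "bayes": bayes_score(pre, post),
    }
    return sum(w * components[name] for name, w in WEIGHTS.items())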

Advantages

  • No training required — Deploy immediately
  • Interpretable — Each component has statistical meaning
  • Fast — Instant predictions
  • Theoretically grounded — Based on established statistical theory

Limitations

  • Lower accuracy — 0.5394 AUC vs. 0.7930 for the best ML model
  • Many false positives — High recall (0.67) but low precision
  • Assumes specific distributions — May not capture complex patterns

Usage

cd hypothesis_testing_pure
python main.py --mode infer --data-dir /path/to/data

No --mode train needed — this model doesn't require training.

When to Use

Good For

  • Baseline comparison
  • When interpretability is critical
  • No training data available
  • Understanding statistical evidence

Avoid If

  • Need high accuracy (use ML models)
  • False positives are costly
  • Complex non-standard patterns