
Bayesian BOCPD Fused Lasso

A 4-component ensemble combining Bayesian Online Change Point Detection (BOCPD) and Group Fused LASSO with feature comparison and statistical tests.

Performance

| Metric     | Value  | Rank   |
|------------|--------|--------|
| ROC AUC    | 0.5005 | 22nd   |
| F1 Score   | 0.0625 | 23rd   |
| Accuracy   | 0.7030 | 10th   |
| Recall     | 0.0333 | Low    |
| Train Time | 183s   | Medium |

Near Random Performance

This model performs barely above random (0.50 AUC) and caught only 1 out of 30 breaks.

Architecture

flowchart TD
    A["📈 Time Series"] --> B1["🔮 BOCPD<br/>30%"]
    A --> B2["📏 Group Fused LASSO<br/>20%"]
    A --> B3["📊 Feature Compare<br/>30%"]

    B1 --> C["📋 Combine Scores"]
    B2 --> C
    B3 --> C

    C --> D["🧪 Statistical Tests<br/>20%"]

    D --> E["🎯 Final Ensemble"]

    style A fill:#e1f5fe
    style B1 fill:#fff3e0
    style B2 fill:#fff3e0
    style B3 fill:#fff3e0
    style D fill:#f3e5f5
    style E fill:#e8f5e9
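
Per the weights in the diagram, the final score can be read as a weighted blend of the four component scores. A minimal sketch of that combination step; the component names and the combine_scores helper are illustrative, not the repo's actual API:

import numpy as np

# Component weights from the architecture diagram (fixed, not tuned here).
WEIGHTS = {"bocpd": 0.30, "fused_lasso": 0.20, "feature_compare": 0.30, "stat_tests": 0.20}

def combine_scores(scores: dict) -> float:
    """Weighted average of per-component break scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# Example: moderate BOCPD and statistical-test evidence, weak LASSO evidence.
print(combine_scores({"bocpd": 0.7, "fused_lasso": 0.1,
                      "feature_compare": 0.4, "stat_tests": 0.6}))  # 0.47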

Components

1. Bayesian Online CPD (30% weight)

Uses a Normal-Inverse-Gamma conjugate prior:

\[ P(\text{break at } t | x_{1:t}) \propto \text{Bayes Factor} \]

Prior: $$ \mu, \sigma^2 \sim \text{Normal-Inverse-Gamma}(\mu_0, \kappa_0, \alpha_0, \beta_0) $$
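
A minimal sketch of the scoring idea, simplified to a per-point Bayes factor (density of the new point under a fresh prior vs. under the running Normal-Inverse-Gamma posterior) rather than the full BOCPD run-length recursion; hyperparameters and function names are assumptions for illustration, not the repo's implementation:

import numpy as np
from scipy.stats import t as student_t

def nig_update(x, mu, kappa, alpha, beta):
    """One-step Normal-Inverse-Gamma conjugate update with observation x."""
    kappa_n = kappa + 1.0
    mu_n = (kappa * mu + x) / kappa_n
    alpha_n = alpha + 0.5
    beta_n = beta + kappa * (x - mu) ** 2 / (2.0 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n

def predictive_pdf(x, mu, kappa, alpha, beta):
    """Student-t posterior predictive density implied by the NIG parameters."""
    scale = np.sqrt(beta * (kappa + 1.0) / (alpha * kappa))
    return student_t.pdf(x, df=2.0 * alpha, loc=mu, scale=scale)

def bayes_factor_scores(xs, prior=(0.0, 1.0, 1.0, 1.0)):
    """Score each point by p(x | fresh prior) / p(x | running posterior).

    Values well above 1 mean the running model explains x poorly,
    i.e. evidence for a change point at that index.
    """
    params, scores = prior, []
    for x in xs:
        p_run = predictive_pdf(x, *params)
        p_new = predictive_pdf(x, *prior)
        scores.append(p_new / max(p_run, 1e-300))
        params = nig_update(x, *params)
    return np.array(scores)

rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
print(int(np.argmax(bayes_factor_scores(series))))  # near the break at index 100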

2. Group Fused LASSO (20% weight)

\[ \min_\beta \frac{1}{2}||y - X\beta||^2 + \lambda_1||\beta||_1 + \lambda_2\sum_i||\beta_i - \beta_{i-1}||_1 \]

Detects breaks through coefficient changes across time.
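
A minimal sketch of this objective in the 1-D case (taking X as the identity), solved with cvxpy; the penalty values lam1/lam2, the jump threshold, and the library choice are assumptions for illustration:

import cvxpy as cp
import numpy as np

def fused_lasso_breaks(y, lam1=0.1, lam2=5.0, jump_tol=0.5):
    """1-D fused lasso: sparsity penalty plus a total-variation penalty on differences.

    Breaks are reported where the fitted piecewise-constant signal jumps.
    """
    beta = cp.Variable(len(y))
    objective = (0.5 * cp.sum_squares(y - beta)
                 + lam1 * cp.norm1(beta)
                 + lam2 * cp.norm1(cp.diff(beta)))
    cp.Problem(cp.Minimize(objective)).solve()
    jumps = np.abs(np.diff(beta.value))
    return np.where(jumps > jump_tol)[0] + 1

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0, 0.3, 80), rng.normal(2, 0.3, 80)])
print(fused_lasso_breaks(y))  # expected: a single break index near 80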

3. Feature Comparison (30% weight)

\[ \text{score} = \min\left(\frac{\text{Cohen's } d}{2}, 1.0\right) \]
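
A minimal sketch of this score, assuming Cohen's d is computed between raw pre- and post-boundary windows using a pooled standard deviation (the actual model may compare derived features instead):

import numpy as np

def feature_break_score(before: np.ndarray, after: np.ndarray) -> float:
    """Cohen's d between pre- and post-boundary values, halved and capped at 1."""
    pooled_var = (((len(before) - 1) * before.var(ddof=1)
                   + (len(after) - 1) * after.var(ddof=1))
                  / (len(before) + len(after) - 2))
    d = abs(before.mean() - after.mean()) / np.sqrt(pooled_var)
    return min(d / 2.0, 1.0)

rng = np.random.default_rng(2)
print(feature_break_score(rng.normal(0, 1, 200), rng.normal(1, 1, 200)))  # ~0.5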

4. Statistical Tests (20% weight)

Combined via Fisher's method: $$ \chi^2 = -2\sum \log(p_i) $$
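
A minimal sketch of Fisher's combination; the p-values are made up for illustration, and scipy's built-in combine_pvalues is shown as a cross-check:

import numpy as np
from scipy.stats import chi2, combine_pvalues

# Fisher's method by hand: the statistic is chi-square with 2k degrees of freedom.
p_values = np.array([0.04, 0.20, 0.01])
stat = -2.0 * np.sum(np.log(p_values))
p_combined = chi2.sf(stat, df=2 * len(p_values))

# scipy's built-in combination gives the same result.
stat_sp, p_sp = combine_pvalues(p_values, method="fisher")
assert np.isclose(p_sp, p_combined)
print(round(stat, 3), round(p_combined, 4))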

Why It Underperformed

  1. BOCPD sensitivity: requires careful hyperparameter tuning for each domain.
  2. Fused LASSO assumptions: assumes smooth coefficient changes, which may not match the data.
  3. Uncalibrated ensemble: the fixed component weights (30/20/30/20) may not be optimal.

Usage

cd bayesian_bocpd_fused_lasso
python main.py --mode train --data-dir /path/to/data --model-path ./model.pkl

Near Random Performance

Despite the sophisticated methodology, this model achieved near-random AUC (~0.50); it is included for research comparison only.