Bayesian BOCPD Fused Lasso
Four-component ensemble combining Bayesian Online Change Point Detection (BOCPD) and Group Fused LASSO with feature comparison and statistical tests.
Performance
| Metric | Value | Rank / Rating |
|---|---|---|
| ROC AUC | 0.5005 | 22nd |
| F1 Score | 0.0625 | 23rd |
| Accuracy | 0.7030 | 10th |
| Recall | 0.0333 | Low |
| Train Time | 183s | Medium |
Near Random Performance
This model performs barely above random: its ROC AUC of 0.5005 is essentially the 0.50 chance level, and its recall of 0.0333 corresponds to catching only 1 of 30 breaks.
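As a sanity check on the table, the sketch below recomputes accuracy, recall, and F1 from one set of confusion-matrix counts. The counts are hypothetical: they were inferred from the reported metrics and are not stated in the source. They are simply one combination that reproduces all three values.

```python
def metrics_from_counts(tp, fp, fn, tn):
    """Return (accuracy, recall, f1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return accuracy, recall, f1


# Hypothetical counts (an inference, not source data):
# 30 true breaks of which 1 was caught, 1 false alarm, 70 true negatives.
accuracy, recall, f1 = metrics_from_counts(tp=1, fp=1, fn=29, tn=70)
print(f"accuracy={accuracy:.4f} recall={recall:.4f} f1={f1:.4f}")
# prints: accuracy=0.7030 recall=0.0333 f1=0.0625
```

Under these assumed counts the accuracy of 0.7030 comes almost entirely from true negatives, which is why it ranks well while recall and F1 do not.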
Architecture
flowchart TD
A["Time Series"] --> B1["BOCPD<br/>30%"]
A --> B2["Group Fused LASSO<br/>20%"]
A --> B3["Feature Compare<br/>30%"]
B1 --> C["Combine Scores"]
B2 --> C
B3 --> C
C --> D["Statistical Tests<br/>20%"]
D --> E["Final Ensemble"]
style A fill:#e1f5fe
style B1 fill:#fff3e0
style B2 fill:#fff3e0
style B3 fill:#fff3e0
style D fill:#f3e5f5
style E fill:#e8f5e9
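The percentages in the flowchart sum to 100%, which suggests a weighted combination of component scores. The following is a minimal sketch under that assumption; the source does not confirm that a plain weighted sum is used, and the dictionary keys and the `ensemble_score` function are hypothetical names chosen for illustration.

```python
# Weights come from the flowchart; the key names are hypothetical labels.
WEIGHTS = {
    "bocpd": 0.30,
    "fused_lasso": 0.20,
    "feature_compare": 0.30,
    "stat_tests": 0.20,
}


def ensemble_score(scores):
    """Weighted sum of component scores, each assumed to lie in [0, 1]."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up component scores, for illustration only.
example = {"bocpd": 0.8, "fused_lasso": 0.1, "feature_compare": 0.5, "stat_tests": 0.4}
combined = ensemble_score(example)
print(round(combined, 3))
```

Because the weights are fixed constants, a component that produces poorly calibrated scores can dominate or dilute the result regardless of how informative it is.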
Components
1. Bayesian Online CPD (30% weight)
Uses a Normal-Inverse-Gamma conjugate prior. The posterior probability of a break at time t is proportional to a Bayes factor:
\[
P(\text{break at } t \mid x_{1:t}) \propto \text{Bayes Factor}
\]
Prior:
\[
\mu, \sigma^2 \sim \text{Normal-Inverse-Gamma}(\mu_0, \kappa_0, \alpha_0, \beta_0)
\]
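To make the role of the Normal-Inverse-Gamma prior concrete, here is a minimal sketch of the standard run-length recursion for Bayesian online change point detection with a constant hazard rate. It is not the repository's implementation: the function names, the hazard value, and the default hyperparameters are assumptions for illustration. It relies on the posterior predictive of this conjugate prior being a Student-t distribution.

```python
import math
import numpy as np

_lgamma = np.vectorize(math.lgamma)


def predictive_logpdf(x, mu, kappa, alpha, beta):
    """Log Student-t posterior predictive under Normal-Inverse-Gamma parameters."""
    nu = 2.0 * alpha
    scale2 = beta * (kappa + 1.0) / (alpha * kappa)
    z2 = (x - mu) ** 2 / (nu * scale2)
    return (_lgamma((nu + 1.0) / 2.0) - _lgamma(nu / 2.0)
            - 0.5 * np.log(nu * np.pi * scale2)
            - (nu + 1.0) / 2.0 * np.log1p(z2))


def bocpd(x, hazard=0.01, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Return R with R[t, r] = P(run length r | first t observations)."""
    n = len(x)
    R = np.zeros((n + 1, n + 1))
    R[0, 0] = 1.0
    mu, kappa = np.array([mu0]), np.array([kappa0])
    alpha, beta = np.array([alpha0]), np.array([beta0])
    for t, xt in enumerate(x):
        pred = np.exp(predictive_logpdf(xt, mu, kappa, alpha, beta))
        joint = R[t, : t + 1] * pred
        R[t + 1, 1 : t + 2] = joint * (1.0 - hazard)  # the current run continues
        R[t + 1, 0] = joint.sum() * hazard            # a break resets the run
        R[t + 1] /= R[t + 1].sum()
        # Conjugate update per run length; index 0 restarts from the prior.
        new_mu = (kappa * mu + xt) / (kappa + 1.0)
        new_beta = beta + kappa * (xt - mu) ** 2 / (2.0 * (kappa + 1.0))
        mu = np.concatenate(([mu0], new_mu))
        beta = np.concatenate(([beta0], new_beta))
        kappa = np.concatenate(([kappa0], kappa + 1.0))
        alpha = np.concatenate(([alpha0], alpha + 0.5))
    return R


# Synthetic series with a mean shift after 100 points (illustrative data only).
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(5.0, 1.0, 100)])
R = bocpd(series)
map_run_length = int(R[-1].argmax())
print("most likely run length at the end:", map_run_length)
```

On a shift this large the most likely final run length should sit close to the number of points after the break. The hazard rate and the four prior hyperparameters all influence how quickly the posterior reacts, and tuning them is domain-specific.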
2. Group Fused LASSO (20% weight)
\[
\min_\beta \frac{1}{2}\lVert y - X\beta \rVert_2^2 + \lambda_1 \lVert \beta \rVert_1 + \lambda_2 \sum_i \lVert \beta_i - \beta_{i-1} \rVert_1
\]
Detects breaks through coefficient changes across time.
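The objective above can be evaluated directly for a candidate coefficient path, which helps show how the two penalties interact. This sketch assumes a time-varying reading of the formula in which row t of `beta` is the coefficient vector at time t and the fitted value is the dot product of row t of `X` with it. That interpretation, and both function names, are assumptions; no solver is included.

```python
import numpy as np


def fused_lasso_objective(y, X, beta, lam1, lam2):
    """Evaluate the penalized objective for a coefficient path.

    y has shape (T,), X and beta have shape (T, p).
    """
    residual = y - np.einsum("tp,tp->t", X, beta)
    fit = 0.5 * np.sum(residual ** 2)
    sparsity = lam1 * np.abs(beta).sum()
    fusion = lam2 * np.abs(np.diff(beta, axis=0)).sum()
    return fit + sparsity + fusion


def jump_sizes(beta):
    """L1 size of the coefficient change between consecutive time steps."""
    return np.abs(np.diff(beta, axis=0)).sum(axis=1)


# Toy path with a single jump between time steps 2 and 3 (illustrative data).
beta_path = np.array([[0.0], [0.0], [0.0], [2.0], [2.0], [2.0]])
X = np.ones((6, 1))
y = beta_path[:, 0].copy()
objective = fused_lasso_objective(y, X, beta_path, lam1=0.1, lam2=0.5)
jumps = jump_sizes(beta_path)
print(objective, jumps)
```

Here the fit term is zero, so the objective consists only of the sparsity penalty (0.1 × 6) and the fusion penalty (0.5 × 2). The largest entry of `jumps` marks the candidate break location.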
3. Feature Comparison (30% weight)
\[
\text{score} = \min\left(\frac{\text{Cohen's } d}{2}, 1.0\right)
\]
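A small sketch of the capped effect-size score follows. It assumes Cohen's d is computed with a pooled standard deviation from the samples before and after a candidate break, and that the absolute value is used so the score is non-negative. Neither detail is stated in the source, and the function names are hypothetical.

```python
import math
import statistics


def cohens_d(before, after):
    """Cohen's d using the pooled sample standard deviation."""
    n1, n2 = len(before), len(after)
    v1, v2 = statistics.variance(before), statistics.variance(after)
    pooled = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    if pooled == 0.0:
        return 0.0  # assumption: treat zero-variance input as no measurable effect
    return (statistics.mean(after) - statistics.mean(before)) / pooled


def feature_score(before, after):
    """Score in [0, 1]: |d| / 2, capped at 1.0 (abs() is an assumption)."""
    return min(abs(cohens_d(before, after)) / 2.0, 1.0)


small_shift = feature_score([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
large_shift = feature_score([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
print(round(small_shift, 4), large_shift)
```

The cap means any effect size of 2 pooled standard deviations or more saturates at 1.0, so the score cannot distinguish large shifts from very large ones.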
4. Statistical Tests (20% weight)
Combined via Fisher's method:
\[
\chi^2 = -2\sum_i \log(p_i)
\]
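Fisher's statistic can be turned into a single combined p-value because, for k independent tests under the null hypothesis, it follows a chi-square distribution with 2k degrees of freedom. The sketch below uses the closed-form survival function for even degrees of freedom, so it needs only the standard library. The function name is hypothetical, and the source does not say which tests are combined or how the result becomes a score.

```python
import math


def fisher_combine(p_values):
    """Return (statistic, combined p-value) for independent p-values."""
    k = len(p_values)
    if k == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must be a non-empty sequence in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Chi-square survival function with 2k degrees of freedom:
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return statistic, math.exp(-half) * total


stat, combined_p = fisher_combine([0.5, 0.5])
print(stat, combined_p)
```

With a single p-value the combined p-value equals the input, which is a quick way to check the formula. The method assumes the tests are independent; correlated tests make the combined p-value too small.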
Why It Underperformed
- BOCPD sensitivity: requires careful hyperparameter tuning for each domain
- Fused LASSO assumptions: the fusion penalty favors piecewise-constant coefficients with few abrupt jumps, which may not match the data
- Ensemble not calibrated: component weights may not be optimal
Usage
cd bayesian_bocpd_fused_lasso
python main.py --mode train --data-dir /path/to/data --model-path ./model.pkl
Near Random Performance
Despite the sophisticated methodology, this model achieved a near-random ROC AUC (0.5005). It is included for research comparison.