
Bayesian BOCPD Fused Lasso

A 4-component ensemble combining Bayesian Online Change Point Detection (BOCPD) and Group Fused LASSO with feature comparison and statistical tests.

Performance

| Metric     | Value  | Rank   |
|------------|--------|--------|
| ROC AUC    | 0.5005 | 22nd   |
| F1 Score   | 0.0625 | 23rd   |
| Accuracy   | 0.7030 | 10th   |
| Recall     | 0.0333 | Low    |
| Train Time | 183s   | Medium |

Near Random Performance

This model performs barely above random (0.50 AUC) and caught only 1 out of 30 breaks.

Architecture

flowchart TD
    A["📈 Time Series"] --> B1["🔮 BOCPD<br/>30%"]
    A --> B2["📏 Group Fused LASSO<br/>20%"]
    A --> B3["📊 Feature Compare<br/>30%"]

    B1 --> C["📋 Combine Scores"]
    B2 --> C
    B3 --> C

    C --> D["🧪 Statistical Tests<br/>20%"]

    D --> E["🎯 Final Ensemble"]

    style A fill:#e1f5fe
    style B1 fill:#fff3e0
    style B2 fill:#fff3e0
    style B3 fill:#fff3e0
    style D fill:#f3e5f5
    style E fill:#e8f5e9
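
Per the weights in the diagram, the final score can be read as a weighted blend of the four component scores. A minimal sketch of that combination step; the component names and the combine_scores helper are illustrative, not the repo's actual API:

import numpy as np

# Component weights from the architecture diagram (fixed, not tuned here).
WEIGHTS = {"bocpd": 0.30, "fused_lasso": 0.20, "feature_compare": 0.30, "stat_tests": 0.20}

def combine_scores(scores: dict) -> float:
    """Weighted average of per-component break scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# Example: moderate BOCPD and statistical-test evidence, weak LASSO evidence.
print(combine_scores({"bocpd": 0.7, "fused_lasso": 0.1,
                      "feature_compare": 0.4, "stat_tests": 0.6}))  # 0.47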

Components

1. Bayesian Online CPD (30% weight)

Uses a Normal-Inverse-Gamma conjugate prior:

\[ P(\text{break at } t | x_{1:t}) \propto \text{Bayes Factor} \]

Prior: $$ \mu, \sigma^2 \sim \text{Normal-Inverse-Gamma}(\mu_0, \kappa_0, \alpha_0, \beta_0) $$
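
A minimal sketch of the scoring idea, simplified to a per-point Bayes factor (density of the new point under a fresh prior vs. under the running Normal-Inverse-Gamma posterior) rather than the full BOCPD run-length recursion; hyperparameters and function names are assumptions for illustration, not the repo's implementation:

import numpy as np
from scipy.stats import t as student_t

def nig_update(x, mu, kappa, alpha, beta):
    """One-step Normal-Inverse-Gamma conjugate update with observation x."""
    kappa_n = kappa + 1.0
    mu_n = (kappa * mu + x) / kappa_n
    alpha_n = alpha + 0.5
    beta_n = beta + kappa * (x - mu) ** 2 / (2.0 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n

def predictive_pdf(x, mu, kappa, alpha, beta):
    """Student-t posterior predictive density implied by the NIG parameters."""
    scale = np.sqrt(beta * (kappa + 1.0) / (alpha * kappa))
    return student_t.pdf(x, df=2.0 * alpha, loc=mu, scale=scale)

def bayes_factor_scores(xs, prior=(0.0, 1.0, 1.0, 1.0)):
    """Score each point by p(x | fresh prior) / p(x | running posterior).

    Values well above 1 mean the running model explains x poorly,
    i.e. evidence for a change point at that index.
    """
    params, scores = prior, []
    for x in xs:
        p_run = predictive_pdf(x, *params)
        p_new = predictive_pdf(x, *prior)
        scores.append(p_new / max(p_run, 1e-300))
        params = nig_update(x, *params)
    return np.array(scores)

rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
print(int(np.argmax(bayes_factor_scores(series))))  # near the break at index 100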

2. Group Fused LASSO (20% weight)

\[ \min_\beta \frac{1}{2}||y - X\beta||^2 + \lambda_1||\beta||_1 + \lambda_2\sum_i||\beta_i - \beta_{i-1}||_1 \]

Detects breaks through coefficient changes across time.
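
A minimal sketch of this objective in the 1-D case (taking X as the identity), solved with cvxpy; the penalty values lam1/lam2, the jump threshold, and the library choice are assumptions for illustration:

import cvxpy as cp
import numpy as np

def fused_lasso_breaks(y, lam1=0.1, lam2=5.0, jump_tol=0.5):
    """1-D fused lasso: sparsity penalty plus a total-variation penalty on differences.

    Breaks are reported where the fitted piecewise-constant signal jumps.
    """
    beta = cp.Variable(len(y))
    objective = (0.5 * cp.sum_squares(y - beta)
                 + lam1 * cp.norm1(beta)
                 + lam2 * cp.norm1(cp.diff(beta)))
    cp.Problem(cp.Minimize(objective)).solve()
    jumps = np.abs(np.diff(beta.value))
    return np.where(jumps > jump_tol)[0] + 1

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0, 0.3, 80), rng.normal(2, 0.3, 80)])
print(fused_lasso_breaks(y))  # expected: a single break index near 80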

3. Feature Comparison (30% weight)

\[ \text{score} = \min\left(\frac{\text{Cohen's } d}{2}, 1.0\right) \]
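
A minimal sketch of this score, assuming Cohen's d is computed between raw pre- and post-boundary windows using a pooled standard deviation (the actual model may compare derived features instead):

import numpy as np

def feature_break_score(before: np.ndarray, after: np.ndarray) -> float:
    """Cohen's d between pre- and post-boundary values, halved and capped at 1."""
    pooled_var = (((len(before) - 1) * before.var(ddof=1)
                   + (len(after) - 1) * after.var(ddof=1))
                  / (len(before) + len(after) - 2))
    d = abs(before.mean() - after.mean()) / np.sqrt(pooled_var)
    return min(d / 2.0, 1.0)

rng = np.random.default_rng(2)
print(feature_break_score(rng.normal(0, 1, 200), rng.normal(1, 1, 200)))  # ~0.5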

4. Statistical Tests (20% weight)

Combined via Fisher's method: $$ \chi^2 = -2\sum \log(p_i) $$
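
A minimal sketch of Fisher's combination; the p-values are made up for illustration, and scipy's built-in combine_pvalues is shown as a cross-check:

import numpy as np
from scipy.stats import chi2, combine_pvalues

# Fisher's method by hand: the statistic is chi-square with 2k degrees of freedom.
p_values = np.array([0.04, 0.20, 0.01])
stat = -2.0 * np.sum(np.log(p_values))
p_combined = chi2.sf(stat, df=2 * len(p_values))

# scipy's built-in combination gives the same result.
stat_sp, p_sp = combine_pvalues(p_values, method="fisher")
assert np.isclose(p_sp, p_combined)
print(round(stat, 3), round(p_combined, 4))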

Why It Underperformed

  1. BOCPD sensitivity: requires careful hyperparameter tuning for each domain.
  2. Fused LASSO assumptions: assumes smooth coefficient changes, which may not match the data.
  3. Uncalibrated ensemble: the fixed component weights (30/20/30/20) may not be optimal.

Usage

cd bayesian_bocpd_fused_lasso
python main.py --mode train --data-dir /path/to/data --model-path ./model.pkl

Near Random Performance

Despite the sophisticated methodology, this model achieved near-random AUC (~0.50); it is included for research comparison only.