Meta Stacking 7 Models¶
**Note on Results:** These results are based on local validation sets provided during the competition phase and do not represent final official leaderboard standings.
A 7-model stacking ensemble that achieved strong Dataset A performance in local validation but overfit significantly, as reflected in its 83.8% stability score.
Cross-Dataset Performance¶
| Metric | Dataset A | Dataset B |
|---|---|---|
| ROC AUC | 0.7662 | 0.6422 |
| F1 Score | 0.5417 | 0.3111 |

| Generalization Metric | Value |
|---|---|
| Robust Score | 0.538 (Rank #12) |
| Stability Score | 83.8% |
| Min AUC | 0.6422 |
| AUC Drop (A → B) | -16.2% |
| Train Time | ~9 hours |
Generalization Analysis¶
This model showed significant performance variance between datasets:
- Dataset A AUC: 0.7662 (Rank #2)
- Dataset B AUC: 0.6422 (Rank #10)
- Dropped 8 ranks between datasets
- Stability of 83.8% (matching the ratio of the two dataset AUCs) indicates overfitting
- F1 also dropped significantly: 0.5417 (A) vs 0.3111 (B)
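The stability and drop figures follow directly from the two AUCs. A quick check in Python, assuming stability is defined as the ratio of the lower to the higher AUC (the robust score's formula is not documented here, so it is not reproduced):

```python
# Stability and AUC drop derived from the two dataset AUCs; the robust
# score's exact formula is not documented here.
auc_a, auc_b = 0.7662, 0.6422

stability = min(auc_a, auc_b) / max(auc_a, auc_b)  # 0.838 -> 83.8%
auc_drop = auc_b / auc_a - 1                       # -0.162 -> -16.2%

print(f"stability: {stability:.1%}, AUC drop: {auc_drop:+.1%}")
```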
Architecture¶
```mermaid
flowchart TD
    A["Input Features (339)"] --> B["Feature Selection → 200 features"]
    B --> C1["MLP 1 Deep"]
    B --> C2["MLP 2 Wide"]
    B --> C3["Gradient Boosting"]
    B --> C4["Random Forest"]
    B --> C5["XGBoost"]
    B --> C6["LightGBM"]
    B --> C7["ExtraTrees"]
    C1 --> D["Out-of-Fold Predictions"]
    C2 --> D
    C3 --> D
    C4 --> D
    C5 --> D
    C6 --> D
    C7 --> D
    D --> E["Meta-Learner (LogReg)"]
    E --> F["Final Probability"]
```
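The selection step that narrows the 339 engineered features down to 200 is not specified above; a minimal sketch, assuming a univariate filter such as scikit-learn's SelectKBest with mutual information (the actual method may differ):

```python
# Hypothetical sketch of the 339 -> 200 feature-selection step; the actual
# selection method used by the pipeline is not documented here.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Stand-in data with the documented feature count
X_train, y_train = make_classification(n_samples=1000, n_features=339, random_state=0)

selector = SelectKBest(score_func=mutual_info_classif, k=200)
X_selected = selector.fit_transform(X_train, y_train)  # shape: (1000, 200)
```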
Base Models¶
Neural Networks¶
- MLP 1 (Deep)
- MLP 2 (Wide)
Tree-Based¶
- Gradient Boosting
- Random Forest
- XGBoost
- LightGBM
- ExtraTrees
Meta-Learner¶
- Logistic Regression trained on the out-of-fold predictions of the seven base models
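A minimal sketch of how the seven base models and the logistic-regression meta-learner could be instantiated with scikit-learn, XGBoost, and LightGBM; all hyperparameters shown are illustrative assumptions, not the tuned values used in the competition:

```python
# Illustrative base-model definitions; layer sizes and hyperparameters
# are assumptions, not the competition's tuned settings.
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              ExtraTreesClassifier)
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

base_models = {
    "mlp_deep": MLPClassifier(hidden_layer_sizes=(128, 64, 32, 16)),  # deep, narrow
    "mlp_wide": MLPClassifier(hidden_layer_sizes=(512, 256)),         # shallow, wide
    "gb":   GradientBoostingClassifier(),
    "rf":   RandomForestClassifier(n_estimators=500),
    "xgb":  XGBClassifier(n_estimators=500, eval_metric="logloss"),
    "lgbm": LGBMClassifier(n_estimators=500),
    "et":   ExtraTreesClassifier(n_estimators=500),
}
meta_learner = LogisticRegression(max_iter=1000)
```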
Feature Set (339 Features)¶
- Statistical moments per segment
- Distribution comparisons (Wasserstein, KL divergence)
- Wavelet decomposition coefficients (db4, sym4, coif2)
- Spectral features from FFT
- Hypothesis test statistics
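A sketch of how a few of these feature families could be computed for one pair of signal segments, using NumPy, SciPy, and PyWavelets; the exact feature definitions in the actual pipeline may differ:

```python
# Illustrative feature extraction for one pair of signal segments.
# Exact definitions in the actual pipeline may differ.
import numpy as np
import pywt
from scipy import stats

def segment_features(a: np.ndarray, b: np.ndarray) -> dict:
    feats = {}
    # Statistical moments per segment
    for name, seg in (("a", a), ("b", b)):
        feats[f"{name}_mean"] = seg.mean()
        feats[f"{name}_std"] = seg.std()
        feats[f"{name}_skew"] = stats.skew(seg)
        feats[f"{name}_kurtosis"] = stats.kurtosis(seg)
    # Distribution comparison
    feats["wasserstein"] = stats.wasserstein_distance(a, b)
    # Wavelet decomposition energies (db4 shown; sym4/coif2 are analogous)
    for i, c in enumerate(pywt.wavedec(a, "db4", level=4)):
        feats[f"db4_energy_{i}"] = float(np.sum(c ** 2))
    # Spectral feature from the FFT magnitude spectrum
    spectrum = np.abs(np.fft.rfft(a))
    feats["spectral_centroid"] = float(
        (spectrum * np.arange(len(spectrum))).sum() / spectrum.sum())
    # Hypothesis test statistic
    feats["ks_stat"] = stats.ks_2samp(a, b).statistic
    return feats

rng = np.random.default_rng(0)
feats = segment_features(rng.normal(size=256), rng.normal(0.5, size=256))
```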
Stacking Process¶
1. Split the training data into K folds.
2. For each fold, train the base models on the remaining K-1 folds and predict on the held-out fold.
3. Combine all out-of-fold predictions into a meta-feature matrix.
4. Train the meta-learner on the meta-features.
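A minimal sketch of this procedure using scikit-learn's cross_val_predict to generate the out-of-fold meta-features, reusing the names from the sketches above; the fold count and all settings are assumptions:

```python
# Illustrative out-of-fold (OOF) stacking, reusing base_models, meta_learner,
# X_selected, and y_train from the sketches above.
import numpy as np
from sklearn.model_selection import cross_val_predict

K = 5

# One meta-feature column per base model: its OOF probability predictions
meta_features = np.column_stack([
    cross_val_predict(model, X_selected, y_train, cv=K, method="predict_proba")[:, 1]
    for model in base_models.values()
])

# Train the logistic-regression meta-learner on the stacked OOF predictions
meta_learner.fit(meta_features, y_train)

# At inference time, each base model is refit on the full training set and its
# test-set probabilities are stacked the same way before calling
# meta_learner.predict_proba.
```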
Usage¶
```bash
cd meta_stacking_7models
python main.py --mode train --data-dir /path/to/data --model-path ./model.joblib
```
Comparison with Other Models¶
| Model | Robust Score | Stability | Dataset A AUC | Dataset B AUC |
|---|---|---|---|---|
| xgb_tuned_regularization | 0.715 | 96.3% | 0.7423 | 0.7705 |
| weighted_dynamic_ensemble | 0.664 | 98.4% | 0.6742 | 0.6849 |
| meta_stacking_7models | 0.538 | 83.8% | 0.7662 | 0.6422 |
Note: Despite achieving the highest F1 score on Dataset A, the model's large performance drop on Dataset B (stability of only 83.8%) limited its robust score to 0.538.