MLP Ensemble Deep Features¶
An ensemble of two MLPs and an XGBoost model combined with soft voting. Achieved a robust score of 0.647 with 95.3% stability.
Cross-Dataset Performance¶
| Metric | Dataset A | Dataset B |
|---|---|---|
| ROC AUC | 0.7122 | 0.6787 |
| F1 Score | 0.2105 | 0.3256 |

| Generalization Metric | Value |
|---|---|
| Robust Score | 0.647 (Rank #4) |
| Stability Score | 95.3% |
| Min AUC | 0.6787 |
| Train Time | 403s |
Generalization Analysis¶
This model showed moderate cross-dataset consistency:
- Dataset A AUC: 0.7122 vs Dataset B AUC: 0.6787
- Stability of 95.3% indicates reasonable generalization
- F1 scores varied: 0.2105 (A) vs 0.3256 (B), an F1 stability of 64.6%
Note: While AUC was stable, F1 scores showed higher variance between datasets.
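The stability and robust scores reported above are consistent with a simple ratio formulation. Below is a minimal Python sketch, assuming stability is the min-to-max ratio of the cross-dataset metric and the robust score is stability multiplied by the minimum AUC; these formulas are inferred from the reported numbers, not taken from the evaluation pipeline.
# Inferred metric definitions (assumption): stability = min/max, robust = stability * min AUC
auc_a, auc_b = 0.7122, 0.6787
stability = min(auc_a, auc_b) / max(auc_a, auc_b)   # ≈ 0.953 (95.3%)
robust_score = stability * min(auc_a, auc_b)        # ≈ 0.647

f1_a, f1_b = 0.2105, 0.3256
f1_stability = min(f1_a, f1_b) / max(f1_a, f1_b)    # ≈ 0.646 (64.6%)
print(f"stability={stability:.3f}, robust={robust_score:.3f}, f1_stability={f1_stability:.3f}")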
Architecture¶
flowchart TD
A["Input Features (114)"] --> B["MLP 1 - Deep 200→100→50→25"]
A --> C["MLP 2 - Wide 300→150"]
A --> D["XGBoost 300 trees"]
B --> E["Soft Voting"]
C --> E
D --> E
E --> F["Final Probability"]
MLP Configurations¶
MLP 1 (Deep Network)¶
from sklearn.neural_network import MLPClassifier

mlp1 = MLPClassifier(
    hidden_layer_sizes=(200, 100, 50, 25),  # 4 hidden layers
    activation='relu',
    solver='adam',
    alpha=0.001,                  # L2 regularization strength
    learning_rate='adaptive',
    learning_rate_init=0.001,
    max_iter=500,
    early_stopping=True,          # hold out part of training data to stop early
    validation_fraction=0.15,
    n_iter_no_change=20
)
MLP 2 (Wide Network)¶
mlp2 = MLPClassifier(
    hidden_layer_sizes=(300, 150),  # 2 hidden layers, wider
    activation='tanh',
    solver='adam',
    alpha=0.01,                   # heavier L2 regularization than MLP 1
    learning_rate='adaptive',
    learning_rate_init=0.001,
    max_iter=500,
    early_stopping=True
)
XGBoost Component¶
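The exact XGBoost configuration is not reproduced here. The following is a minimal sketch, assuming a scikit-learn-style XGBClassifier with the 300 trees shown in the architecture diagram; every other hyperparameter is an illustrative placeholder, not the tuned value.
from xgboost import XGBClassifier

xgb = XGBClassifier(
    n_estimators=300,        # tree count from the architecture diagram
    max_depth=6,             # assumption: placeholder depth
    learning_rate=0.1,       # assumption
    subsample=0.8,           # assumption
    colsample_bytree=0.8,    # assumption
    eval_metric='logloss'
)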
Ensemble Weights¶
| Component | Weight |
|---|---|
| MLP 1 (Deep) | 35% |
| MLP 2 (Wide) | 30% |
| XGBoost | 35% |
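A minimal sketch of combining the three components with scikit-learn's soft-voting ensemble, using the weights above. The names mlp1, mlp2, and xgb refer to the configurations sketched earlier, and X_train, y_train, X_test are placeholder data; this illustrates the wiring, not the project's exact training script.
from sklearn.ensemble import VotingClassifier

# Soft voting: weighted average of each component's predict_proba output.
ensemble = VotingClassifier(
    estimators=[('mlp_deep', mlp1), ('mlp_wide', mlp2), ('xgb', xgb)],
    voting='soft',
    weights=[0.35, 0.30, 0.35]
)
ensemble.fit(X_train, y_train)               # X_train: 114 input features per row
proba = ensemble.predict_proba(X_test)[:, 1] # final positive-class probability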
Why This Design¶
- Deep Network: Captures complex non-linear feature interactions
- Wide Network: Memorizes important patterns with different activation
- XGBoost: Provides tree-based complementary predictions
- Soft Voting: Averages probabilities for smoother predictions
Usage¶
cd mlp_ensemble_deep_features
python main.py --mode train --data-dir /path/to/data --model-path ./model.joblib
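For inference outside the CLI, the sketch below assumes the saved artifact at ./model.joblib is a fitted scikit-learn-compatible estimator exposing predict_proba; the input matrix is a random placeholder, and real data must go through the same preprocessing as training.
import joblib
import numpy as np

model = joblib.load("./model.joblib")
X_new = np.random.rand(5, 114)            # placeholder batch: 5 rows, 114 features
probs = model.predict_proba(X_new)[:, 1]  # probability of the positive class
print(probs)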
Comparison with Other Models¶
| Model | Robust Score | Stability | Dataset A | Dataset B |
|---|---|---|---|---|
| xgb_tuned_regularization | 0.715 | 96.3% | 0.7423 | 0.7705 |
| mlp_ensemble_deep_features | 0.647 | 95.3% | 0.7122 | 0.6787 |
| wavelet_lstm | 0.476 | 95.3% | 0.5249 | 0.5000 |
| hierarchical_transformer | 0.435 | 89.4% | 0.5439 | 0.4862 |
Note: This is the only neural network approach that achieved competitive results. Pure deep learning models (LSTM, Transformer) showed near-random AUC (~0.50).