MLP Ensemble Deep Features¶
An ensemble of two MLPs and an XGBoost model combined with soft voting. Achieved a robust score of 0.647 with 95.3% stability.
Cross-Dataset Performance¶
| Metric | Dataset A | Dataset B |
|---|---|---|
| ROC AUC | 0.7122 | 0.6787 |
| F1 Score | 0.2105 | 0.3256 |

| Generalization Metric | Value |
|---|---|
| Robust Score | 0.647 (Rank #4) |
| Stability Score | 95.3% |
| Min AUC | 0.6787 |
| Train Time | 403s |
Generalization Analysis¶
This model showed moderate cross-dataset consistency:
- Dataset A AUC: 0.7122 vs Dataset B AUC: 0.6787
- Stability of 95.3% indicates reasonable generalization
- F1 scores varied: 0.2105 (A) vs 0.3256 (B), an F1 stability of 64.6%
Note: While AUC was stable, F1 scores showed higher variance between datasets.
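The stability and robust scores reported above are consistent with a simple ratio formulation. Below is a minimal Python sketch, assuming stability is the min-to-max ratio of the cross-dataset metric and the robust score is stability multiplied by the minimum AUC; these formulas are inferred from the reported numbers, not taken from the evaluation pipeline.
# Inferred metric definitions (assumption): stability = min/max, robust = stability * min AUC
auc_a, auc_b = 0.7122, 0.6787
stability = min(auc_a, auc_b) / max(auc_a, auc_b)   # ≈ 0.953 (95.3%)
robust_score = stability * min(auc_a, auc_b)        # ≈ 0.647

f1_a, f1_b = 0.2105, 0.3256
f1_stability = min(f1_a, f1_b) / max(f1_a, f1_b)    # ≈ 0.646 (64.6%)
print(f"stability={stability:.3f}, robust={robust_score:.3f}, f1_stability={f1_stability:.3f}")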
Architecture¶
flowchart TD
A["Input Features (114)"] --> B["MLP 1 - Deep 200→100→50→25"]
A --> C["MLP 2 - Wide 300→150"]
A --> D["XGBoost 300 trees"]
B --> E["Soft Voting"]
C --> E
D --> E
E --> F["Final Probability"]
MLP Configurations¶
MLP 1 (Deep Network)¶
from sklearn.neural_network import MLPClassifier

mlp1 = MLPClassifier(
    hidden_layer_sizes=(200, 100, 50, 25),  # 4 hidden layers
    activation='relu',
    solver='adam',
    alpha=0.001,                  # L2 regularization strength
    learning_rate='adaptive',
    learning_rate_init=0.001,
    max_iter=500,
    early_stopping=True,          # hold out part of training data to stop early
    validation_fraction=0.15,
    n_iter_no_change=20
)
MLP 2 (Wide Network)¶
mlp2 = MLPClassifier(
    hidden_layer_sizes=(300, 150),  # 2 hidden layers, wider
    activation='tanh',
    solver='adam',
    alpha=0.01,                   # heavier L2 regularization than MLP 1
    learning_rate='adaptive',
    learning_rate_init=0.001,
    max_iter=500,
    early_stopping=True
)
XGBoost Component¶
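The exact XGBoost configuration is not reproduced here. The following is a minimal sketch, assuming a scikit-learn-style XGBClassifier with the 300 trees shown in the architecture diagram; every other hyperparameter is an illustrative placeholder, not the tuned value.
from xgboost import XGBClassifier

xgb = XGBClassifier(
    n_estimators=300,        # tree count from the architecture diagram
    max_depth=6,             # assumption: placeholder depth
    learning_rate=0.1,       # assumption
    subsample=0.8,           # assumption
    colsample_bytree=0.8,    # assumption
    eval_metric='logloss'
)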
Ensemble Weights¶
| Component | Weight |
|---|---|
| MLP 1 (Deep) | 35% |
| MLP 2 (Wide) | 30% |
| XGBoost | 35% |
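A minimal sketch of combining the three components with scikit-learn's soft-voting ensemble, using the weights above. The names mlp1, mlp2, and xgb refer to the configurations sketched earlier, and X_train, y_train, X_test are placeholder data; this illustrates the wiring, not the project's exact training script.
from sklearn.ensemble import VotingClassifier

# Soft voting: weighted average of each component's predict_proba output.
ensemble = VotingClassifier(
    estimators=[('mlp_deep', mlp1), ('mlp_wide', mlp2), ('xgb', xgb)],
    voting='soft',
    weights=[0.35, 0.30, 0.35]
)
ensemble.fit(X_train, y_train)               # X_train: 114 input features per row
proba = ensemble.predict_proba(X_test)[:, 1] # final positive-class probability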
Why This Design¶
- Deep Network: Captures complex non-linear feature interactions
- Wide Network: Memorizes important patterns with different activation
- XGBoost: Provides tree-based complementary predictions
- Soft Voting: Averages probabilities for smoother predictions
Usage¶
cd mlp_ensemble_deep_features
python main.py --mode train --data-dir /path/to/data --model-path ./model.joblib
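For inference outside the CLI, the sketch below assumes the saved artifact at ./model.joblib is a fitted scikit-learn-compatible estimator exposing predict_proba; the input matrix is a random placeholder, and real data must go through the same preprocessing as training.
import joblib
import numpy as np

model = joblib.load("./model.joblib")
X_new = np.random.rand(5, 114)            # placeholder batch: 5 rows, 114 features
probs = model.predict_proba(X_new)[:, 1]  # probability of the positive class
print(probs)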
Comparison with Other Models¶
| Model | Robust Score | Stability | Dataset A | Dataset B |
|---|---|---|---|---|
| xgb_tuned_regularization | 0.715 | 96.3% | 0.7423 | 0.7705 |
| mlp_ensemble_deep_features | 0.647 | 95.3% | 0.7122 | 0.6787 |
| wavelet_lstm | 0.476 | 95.3% | 0.5249 | 0.5000 |
| hierarchical_transformer | 0.435 | 89.4% | 0.5439 | 0.4862 |
Note: This is the only neural network approach that achieved competitive results. Pure deep learning models (LSTM, Transformer) showed near-random AUC (~0.50).