MLP Ensemble Deep Features

Ensemble of 2 MLPs + XGBoost with soft voting. Achieved a robust score of 0.647 with 95.3% stability.

Cross-Dataset Performance

| Metric   | Dataset A | Dataset B |
|----------|-----------|-----------|
| ROC AUC  | 0.7122    | 0.6787    |
| F1 Score | 0.2105    | 0.3256    |

| Generalization Metric | Value           |
|-----------------------|-----------------|
| Robust Score          | 0.647 (Rank #4) |
| Stability Score       | 95.3%           |
| Min AUC               | 0.6787          |
| Train Time            | 403 s           |

Generalization Analysis

This model showed moderate cross-dataset consistency:

  • Dataset A AUC: 0.7122 vs Dataset B AUC: 0.6787
  • Stability of 95.3% indicates reasonable generalization
  • F1 scores varied more: 0.2105 (A) vs 0.3256 (B), giving an F1 stability of 64.6%

Note: While AUC was stable, F1 scores showed higher variance between datasets.
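The stability figures above are consistent with taking the ratio of the weaker dataset score to the stronger one. A minimal sketch of that calculation (the benchmark's exact formula is an assumption):

```python
def stability(score_a: float, score_b: float) -> float:
    """Ratio of the weaker score to the stronger one (assumed definition)."""
    return min(score_a, score_b) / max(score_a, score_b)

auc_stability = stability(0.7122, 0.6787)  # ~0.953, matching the 95.3% above
f1_stability = stability(0.2105, 0.3256)   # ~0.646, matching the 64.6% above
```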

Architecture

```mermaid
flowchart TD
    A["Input Features (114)"] --> B["MLP 1 - Deep 200→100→50→25"]
    A --> C["MLP 2 - Wide 300→150"]
    A --> D["XGBoost 300 trees"]

    B --> E["Soft Voting"]
    C --> E
    D --> E

    E --> F["Final Probability"]
```

MLP Configurations

MLP 1 (Deep Network)

```python
MLPClassifier(
    hidden_layer_sizes=(200, 100, 50, 25),  # 4 hidden layers
    activation='relu',
    solver='adam',
    alpha=0.001,
    learning_rate='adaptive',
    learning_rate_init=0.001,
    max_iter=500,
    early_stopping=True,
    validation_fraction=0.15,
    n_iter_no_change=20
)
```

MLP 2 (Wide Network)

```python
MLPClassifier(
    hidden_layer_sizes=(300, 150),  # 2 hidden layers, wider
    activation='tanh',
    solver='adam',
    alpha=0.01,
    learning_rate='adaptive',
    learning_rate_init=0.001,
    max_iter=500,
    early_stopping=True
)
```

XGBoost Component

```python
XGBClassifier(
    n_estimators=300,
    max_depth=6,
    learning_rate=0.05
)
```

Ensemble Weights

| Component    | Weight |
|--------------|--------|
| MLP 1 (Deep) | 35%    |
| MLP 2 (Wide) | 30%    |
| XGBoost      | 35%    |

Why This Design

  • Deep Network: Captures complex non-linear feature interactions
  • Wide Network: Memorizes important patterns with different activation
  • XGBoost: Provides tree-based complementary predictions
  • Soft Voting: Averages probabilities for smoother predictions
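Soft voting reduces to a weighted average of per-model probabilities. A small numeric illustration with made-up component outputs:

```python
import numpy as np

weights = np.array([0.35, 0.30, 0.35])  # MLP 1, MLP 2, XGBoost
probs = np.array([0.62, 0.48, 0.71])    # hypothetical P(class=1) from each model
final = float(weights @ probs)          # weighted average of probabilities
# 0.35*0.62 + 0.30*0.48 + 0.35*0.71 = 0.6095
```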

Usage

```shell
cd mlp_ensemble_deep_features
python main.py --mode train --data-dir /path/to/data --model-path ./model.joblib
```

Comparison with Other Models

| Model                      | Robust Score | Stability | Dataset A | Dataset B |
|----------------------------|--------------|-----------|-----------|-----------|
| xgb_tuned_regularization   | 0.715        | 96.3%     | 0.7423    | 0.7705    |
| mlp_ensemble_deep_features | 0.647        | 95.3%     | 0.7122    | 0.6787    |
| wavelet_lstm               | 0.476        | 95.3%     | 0.5249    | 0.5000    |
| hierarchical_transformer   | 0.435        | 89.4%     | 0.5439    | 0.4862    |

Note: This is the only neural network approach that achieved competitive results. Pure deep learning models (LSTM, Transformer) showed near-random AUC (~0.50).