# Neural Network Models
Neural networks learn hierarchical representations through layers of interconnected nodes.
## Overview
| Model | ROC AUC | F1 Score | Status |
|---|---|---|---|
| MLP Ensemble | 0.7122 | 0.2105 | Viable |
| Wavelet LSTM | 0.5249 | 0.0000 | Failed |
| Hierarchical Transformer | 0.5439 | 0.0000 | Failed |
**Limited Success:** Only the MLP ensemble achieved meaningful performance; the LSTM and Transformer models predicted all zeros.
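The metric pattern in the table is the signature of this degenerate behaviour. The snippet below is illustrative only (not project code; the 5% positive rate is an assumed value): a model that never predicts the positive class gets an F1 of exactly 0, while its probability scores can still yield a ROC AUC near 0.5.

```python
# Illustrative only -- not project code; the 5% positive rate is assumed.
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

rng = np.random.default_rng(0)
y_true = (rng.random(1000) < 0.05).astype(int)  # imbalanced labels, ~5% positive

y_pred = np.zeros_like(y_true)        # hard predictions: always the negative class
y_score = rng.random(1000) * 0.1      # low, uninformative probabilities

print(f1_score(y_true, y_pred, zero_division=0))  # 0.0 -- no positives ever predicted
print(roc_auc_score(y_true, y_score))             # ~0.5 -- ranking is essentially random
```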
## Why Neural Networks Underperformed
### The Core Problem
Neural networks are designed to learn relationships across many input variables. With only a univariate time series and features derived from it, they lack the rich, multivariate input space these architectures need.
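A minimal sketch of what that input space looks like in practice (function and column names are hypothetical, not the project's feature pipeline): every feature is some transformation of the same single series.

```python
# Hedged sketch -- hypothetical feature names, not the project's pipeline.
import pandas as pd

def build_features(prices: pd.Series, lags: int = 5) -> pd.DataFrame:
    """Every column below is derived from the one input series."""
    feats = pd.DataFrame({"return_1d": prices.pct_change()})
    for k in range(1, lags + 1):
        feats[f"return_lag_{k}"] = feats["return_1d"].shift(k)   # lagged copies
    feats["rolling_mean_10"] = feats["return_1d"].rolling(10).mean()
    feats["rolling_std_10"] = feats["return_1d"].rolling(10).std()
    return feats.dropna()
```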
### LSTM Limitations
- Struggle to retain information over long sequences
- Cannot dynamically revise what they have already stored
- Cannot anticipate regime shifts without external data (see the sketch below)
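For concreteness, the setup these points refer to looks roughly like the following (assumed PyTorch; not the project's implementation): the network's only input is a window of the series itself.

```python
# Assumed PyTorch sketch -- not the project's implementation.
import torch
import torch.nn as nn

class UnivariateLSTM(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        # input_size=1: each time step carries a single value
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):              # x: (batch, window_len, 1)
        _, (h_n, _) = self.lstm(x)     # final hidden state summarizes the window
        return self.head(h_n[-1])      # logit for the positive class

logits = UnivariateLSTM()(torch.randn(8, 60, 1))  # 8 windows of 60 steps each
```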
### Transformer Limitations
- Require large datasets to learn meaningful attention patterns
- Tend to underperform on numerical prediction when used in isolation
- Cross-entropy loss pushes predictions toward the majority class (illustrated below)
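The last point can be made concrete with a small calculation (the 5% positive rate is an assumed value): for a constant predicted probability, binary cross-entropy is minimized at the base rate, which sits far below a 0.5 decision threshold, so every thresholded prediction lands in the majority class.

```python
# Illustrative calculation; the 5% positive rate is an assumed value.
import numpy as np

base_rate = 0.05
p = np.linspace(0.001, 0.999, 999)
bce = -(base_rate * np.log(p) + (1 - base_rate) * np.log(1 - p))

print(p[bce.argmin()])  # ~0.05: far below 0.5, so thresholding yields all zeros
```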
## What Would Help
To leverage these architectures effectively, add exogenous variables:
| Type | Examples |
|---|---|
| Correlated series | Related assets, sector indices |
| Macroeconomic | Interest rates, VIX |
| Sentiment | News, social media |
| Technical | Volume, bid-ask spread |
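A hedged sketch of what joining such series onto the feature matrix could look like (all names and data sources here are hypothetical):

```python
# Hypothetical column names and data sources -- illustration only.
import numpy as np
import pandas as pd

def add_exogenous(features: pd.DataFrame,
                  vix: pd.Series,
                  sector_index: pd.Series,
                  volume: pd.Series) -> pd.DataFrame:
    """Align exogenous series on the shared date index and append them."""
    exog = pd.DataFrame({
        "vix": vix,                                   # macroeconomic
        "sector_return": sector_index.pct_change(),   # correlated series
        "log_volume": np.log1p(volume),               # technical
    })
    return features.join(exog, how="inner").dropna()
```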
## MLP Architecture
The MLP ensemble is the only successful neural network approach (a minimal sketch follows the list below). Key success factors:
- Ensemble of architectures — Different MLP configurations
- Includes XGBoost — Tree-based component adds diversity
- Direct feature input — Doesn't try to learn from raw sequences
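A minimal sketch of this kind of ensemble (assumed scikit-learn / XGBoost APIs; hyperparameters are placeholders, and the project's actual configuration may differ):

```python
# Sketch only -- hyperparameters are placeholders, not the project's settings.
from sklearn.ensemble import VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier

ensemble = VotingClassifier(
    estimators=[
        ("mlp_wide", make_pipeline(StandardScaler(),
                                   MLPClassifier(hidden_layer_sizes=(128,), max_iter=500))),
        ("mlp_deep", make_pipeline(StandardScaler(),
                                   MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500))),
        ("xgb", XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")),
    ],
    voting="soft",  # average predicted probabilities across members
)
# ensemble.fit(X_train, y_train)
# p = ensemble.predict_proba(X_test)[:, 1]
```

Soft voting averages the members' predicted probabilities, so the tree-based component can compensate where the MLP configurations disagree.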
## Key Findings
**Neural Network Performance:** `mlp_ensemble_deep_features` was the only neural approach that performed reasonably well (0.7122 ROC AUC, 0.647 robust score).
Pure deep learning models (LSTM, Transformer) achieved near-random performance on univariate data.