
Neural Network Models

Neural networks learn hierarchical representations through layers of interconnected nodes.

Overview

Model                     ROC AUC   F1 Score   Status
MLP Ensemble              0.7122    0.2105     Viable
Wavelet LSTM              0.5249    0.0000     Failed
Hierarchical Transformer  0.5439    0.0000     Failed

Limited Success

Only the MLP ensemble achieved meaningful performance. The LSTM and Transformer models collapsed to predicting the majority class (all zeros), which is why their F1 scores are 0.0000.
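
As a concrete illustration of that failure mode, here is a minimal scikit-learn sketch with made-up labels (a 90/10 class split is an assumption, not the project's data): a model that predicts zero for every sample looks fine on accuracy but scores 0.0 on F1 and chance-level ROC AUC.

    from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

    # Hypothetical imbalanced labels: 90% negatives, 10% positives (made-up data)
    y_true = [0] * 90 + [1] * 10

    # A collapsed model: predicts 0 for every sample with a constant score
    y_pred  = [0] * 100
    y_score = [0.1] * 100   # no ranking information at all

    print(accuracy_score(y_true, y_pred))   # 0.90 -- looks fine, but is useless
    print(f1_score(y_true, y_pred))         # 0.0  -- no true positives
    print(roc_auc_score(y_true, y_score))   # 0.5  -- chance-level ranking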

Why Neural Networks Underperformed

The Core Problem

Neural networks are designed to learn relationships between multiple input variables. With only a univariate time series and features derived from it, they lack the rich input space these architectures need.
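
For context, "derived features" here means transformations of the single series itself, such as lags and rolling statistics. A minimal pandas sketch (the column name price and the specific lags and windows are illustrative assumptions, not the project's actual feature set):

    import pandas as pd

    def derive_features(series: pd.Series) -> pd.DataFrame:
        """Build features from a single univariate series (illustrative only)."""
        df = pd.DataFrame({"price": series})
        df["return_1"] = df["price"].pct_change()            # 1-step return
        df["lag_1"] = df["price"].shift(1)                    # lagged levels
        df["lag_5"] = df["price"].shift(5)
        df["roll_mean_10"] = df["price"].rolling(10).mean()   # rolling statistics
        df["roll_std_10"] = df["price"].rolling(10).std()
        return df.dropna()

    # Every column is a function of the same series -- there is no
    # genuinely new information for a deep model to exploit.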

LSTM Limitations

  • Struggle to retain information over long sequences
  • Cannot dynamically revise what has already been written to memory
  • Cannot anticipate regime shifts without external data

Transformer Limitations

  • Require large datasets for meaningful attention patterns
  • Underperform on purely numerical prediction when used in isolation
  • Cross-entropy loss pushes predictions toward the majority class (a mitigation sketch follows this list)
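
The last point can be partially offset by re-weighting the loss. A minimal PyTorch sketch for a binary target (the positive-class weight of 9.0 is an assumed illustration for a roughly 10% positive rate, not a value taken from the project):

    import torch
    import torch.nn as nn

    # Up-weight the rare positive class so predicting all zeros stops being
    # a cheap way to minimize the loss (weight ~= n_negative / n_positive).
    pos_weight = torch.tensor([9.0])
    criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

    logits  = torch.randn(32, 1)                     # model outputs
    targets = torch.randint(0, 2, (32, 1)).float()   # 0/1 labels
    loss = criterion(logits, targets)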

What Would Help

To leverage these architectures effectively, add exogenous variables:

Type               Examples
Correlated series  Related assets, sector indices
Macroeconomic      Interest rates, VIX
Sentiment          News, social media
Technical          Volume, bid-ask spread
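
A minimal sketch of what such a multivariate input frame could look like (all series and column names here — rate_10y, vix, news_score, volume — are hypothetical placeholders with random values, not sources the project currently ingests):

    import numpy as np
    import pandas as pd

    # Hypothetical exogenous sources on a shared daily index.
    idx = pd.date_range("2024-01-01", periods=100, freq="D")
    rng = np.random.default_rng(0)

    target    = pd.Series(rng.normal(size=100).cumsum(), index=idx, name="target")
    macro     = pd.DataFrame({"rate_10y": rng.normal(4.0, 0.1, 100),
                              "vix": rng.normal(15.0, 2.0, 100)}, index=idx)
    sentiment = pd.DataFrame({"news_score": rng.uniform(-1, 1, 100)}, index=idx)
    technical = pd.DataFrame({"volume": rng.integers(1_000, 5_000, 100)}, index=idx)

    # One multivariate frame: each timestamp now carries information the
    # target series alone cannot provide.
    X = pd.concat([target, macro, sentiment, technical], axis=1)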

MLP Architecture

The MLP ensemble is the only successful neural network approach:

Features → [MLP Deep (4 layers) + MLP Wide (2 layers) + XGBoost] → Soft Voting → Probability

Key success factors:

  1. Ensemble of architectures — Different MLP configurations
  2. Includes XGBoost — Tree-based component adds diversity
  3. Direct feature input — Doesn't try to learn from raw sequences
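
A minimal sketch of this pipeline using scikit-learn and XGBoost (layer widths, estimator counts, and the random toy data are assumptions to illustrate the deep-plus-wide-plus-tree soft-voting structure, not the project's tuned configuration):

    import numpy as np
    from sklearn.ensemble import VotingClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from xgboost import XGBClassifier

    # Deep MLP: four narrower hidden layers; Wide MLP: two broad hidden layers.
    mlp_deep = make_pipeline(StandardScaler(),
                             MLPClassifier(hidden_layer_sizes=(64, 64, 32, 16),
                                           max_iter=500, random_state=0))
    mlp_wide = make_pipeline(StandardScaler(),
                             MLPClassifier(hidden_layer_sizes=(256, 128),
                                           max_iter=500, random_state=0))
    xgb = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")

    # Soft voting averages each model's predicted probabilities.
    ensemble = VotingClassifier(
        estimators=[("mlp_deep", mlp_deep), ("mlp_wide", mlp_wide), ("xgb", xgb)],
        voting="soft",
    )

    # X: tabular feature matrix (not raw sequences); y: binary labels.
    X = np.random.rand(200, 20)
    y = np.random.randint(0, 2, 200)
    ensemble.fit(X, y)
    proba = ensemble.predict_proba(X)[:, 1]   # averaged class-1 probabilities

The tree-based XGBoost component errs differently from the two MLPs, which is what makes the averaged probabilities more robust than any single model's output.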

Key Findings

Neural Network Performance

mlp_ensemble_deep_features was the only neural approach that performed reasonably well (0.7122 AUC, 0.647 robust score).

Pure deep learning models (LSTM, Transformer) achieved near-random performance on univariate data.