# XGB Core 7 Features
A minimal XGBoost model that uses only 7 carefully selected features. It achieves high cross-dataset stability (98.0%) with the fastest training time of the models compared (40s).
## Cross-Dataset Performance
| Metric | Dataset A | Dataset B |
|---|---|---|
| ROC AUC | 0.6188 | 0.6315 |
| F1 Score | 0.4675 | 0.4571 |

| Generalization Metric | Value |
|---|---|
| Robust Score | 0.606 (Rank #8) |
| Stability Score | 98.0% |
| Min AUC | 0.6188 |
| Train Time | 40s (Fastest) |
## Generalization Analysis
This model showed highly consistent performance across datasets:
- Stability of 98.0% - one of the most stable models
- Dataset A AUC (0.6188) vs Dataset B AUC (0.6315) - minimal variance
- F1 scores also stable: 0.4675 (A) vs 0.4571 (B)
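The reported stability figure is consistent with the ratio of the worst to the best cross-dataset AUC. That definition is an inference from the numbers above, not one stated by the source, but it reproduces the 98.0% exactly:

```python
# Sketch: stability as min/max cross-dataset ROC AUC.
# This definition is inferred from the reported values, not confirmed by the source.
auc_a = 0.6188  # Dataset A ROC AUC
auc_b = 0.6315  # Dataset B ROC AUC

stability = min(auc_a, auc_b) / max(auc_a, auc_b)
print(f"Stability: {stability:.1%}")  # → Stability: 98.0%
```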
## Architecture
### The 7 Core Features
| Feature | Purpose |
|---|---|
| mean_diff | Level shift magnitude |
| std_ratio | Volatility change |
| cohens_d | Standardized effect size |
| ks_statistic | Distribution difference |
| welch_t_stat | Mean significance |
| median_diff | Robust location shift |
| iqr_ratio | Robust spread change |
These 7 features capture the essential aspects of structural breaks:
- Location: mean_diff, median_diff, welch_t_stat
- Scale: std_ratio, iqr_ratio
- Effect: cohens_d
- Distribution: ks_statistic
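As an illustration, all seven features can be computed from the two sides of a candidate break with NumPy and SciPy. This is a sketch under assumed definitions (the function and variable names are not the repo's actual code):

```python
import numpy as np
from scipy import stats

def core_features(pre: np.ndarray, post: np.ndarray) -> dict:
    """Sketch: the 7 core break features for the segments before/after a break."""
    eps = 1e-9  # guard against division by zero

    def iqr(x):
        return np.subtract(*np.percentile(x, [75, 25]))  # p75 - p25

    pooled_std = np.sqrt((pre.var(ddof=1) + post.var(ddof=1)) / 2)
    return {
        # Location
        "mean_diff": post.mean() - pre.mean(),
        "median_diff": np.median(post) - np.median(pre),
        "welch_t_stat": stats.ttest_ind(post, pre, equal_var=False).statistic,
        # Scale
        "std_ratio": post.std(ddof=1) / (pre.std(ddof=1) + eps),
        "iqr_ratio": iqr(post) / (iqr(pre) + eps),
        # Effect size
        "cohens_d": (post.mean() - pre.mean()) / (pooled_std + eps),
        # Distribution
        "ks_statistic": stats.ks_2samp(pre, post).statistic,
    }

rng = np.random.default_rng(0)
feats = core_features(rng.normal(0, 1, 200), rng.normal(0.5, 1.5, 200))
```

The resulting dict maps directly onto the model's 7-column feature matrix.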
## Hyperparameters
```python
from xgboost import XGBClassifier

model = XGBClassifier(
    n_estimators=200,
    max_depth=6,
    learning_rate=0.05,
    subsample=0.8,
    colsample_bytree=0.8,
    min_child_weight=3,
    gamma=0.1,
    reg_alpha=0.1,
    reg_lambda=1.0,
)
```
## Usage
```bash
cd xgb_core_7features
python main.py --mode train --data-dir /path/to/data --model-path ./model.joblib
```
## Comparison with Other Models
| Model | Features | Robust Score | Stability | Dataset A | Dataset B |
|---|---|---|---|---|---|
| xgb_core_7features | 7 | 0.606 | 98.0% | 0.6188 | 0.6315 |
| xgb_tuned_regularization | 70+ | 0.715 | 96.3% | 0.7423 | 0.7705 |
| gradient_boost_comprehensive | 100+ | 0.538 | 82.4% | 0.7930 | 0.6533 |
Note: While gradient_boost_comprehensive achieved the highest Dataset A AUC (0.7930), its low stability (82.4%) dragged its robust score (0.538) below that of this 7-feature model (0.606).
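The robust scores in the comparison table are consistent with min cross-dataset AUC × stability, where stability is the min/max AUC ratio. This formula is inferred from the table values, not stated by the source, but it reproduces all three rows:

```python
# Sketch: robust score reproduced as min(AUC) * stability across the two datasets.
# Formula inferred from the table values; not confirmed by the source.
models = {
    "xgb_core_7features": (0.6188, 0.6315),
    "xgb_tuned_regularization": (0.7423, 0.7705),
    "gradient_boost_comprehensive": (0.7930, 0.6533),
}

robust_scores = {}
for name, (auc_a, auc_b) in models.items():
    stability = min(auc_a, auc_b) / max(auc_a, auc_b)
    robust_scores[name] = min(auc_a, auc_b) * stability
    print(f"{name}: robust={robust_scores[name]:.3f}, stability={stability:.1%}")
# Reproduces the reported 0.606 / 0.715 / 0.538 to three decimals.
```

Under this reading, a model is penalized twice for inconsistency: once through the lower of its two AUCs, and again through the stability ratio, which is why gradient_boost_comprehensive ranks below the 7-feature model despite its higher peak AUC.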