Machine learning for tabular data sits at the intersection of traditional statistics and modern deep learning. Unlike image or text domains where neural networks reign supreme, tabular data presents unique challenges (heterogeneous feature types, missing values, varied scales, and complex feature interactions) where tree-based gradient boosting methods still dominate Kaggle competitions and production systems. This cheat sheet covers the full spectrum: from XGBoost hyperparameter tuning and CatBoost's native categorical handling to emerging tabular transformers like FT-Transformer and TabNet, plus critical preprocessing techniques, explainability methods, and the practical engineering decisions that separate toy models from production-ready systems.
What This Cheat Sheet Covers
This cheat sheet spans 24 focused tables and 124 indexed concepts. Below is a complete table-by-table outline, from foundational concepts through advanced details.
Table 1: Gradient Boosting Libraries
The three dominant gradient boosting libraries each bring distinct optimizations and design philosophies. XGBoost pioneered regularization and sparsity-aware algorithms, LightGBM introduced histogram-based splitting and leaf-wise growth for speed, and CatBoost handles categorical features natively without preprocessing. Choice depends on dataset size, categorical cardinality, hardware constraints, and whether you need GPU acceleration or auto-handling of categories.
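To make "histogram-based splitting" and "regularization" concrete, here is a minimal pure-Python sketch of how a single feature might be scored for a split. It is an illustration under stated assumptions, not the implementation used by any of the three libraries: the function names `leaf_score` and `best_histogram_split` are hypothetical, the bins are equal-width (real libraries choose bin edges more carefully), and the gain is the common second-order form with an L2 term `reg_lambda` and no other penalties.

```python
from typing import List, Optional, Tuple


def leaf_score(grad_sum: float, hess_sum: float, reg_lambda: float) -> float:
    """Second-order structure score G^2 / (H + lambda) for one leaf."""
    return grad_sum * grad_sum / (hess_sum + reg_lambda)


def best_histogram_split(
    values: List[float],
    grads: List[float],
    hess: List[float],
    n_bins: int = 4,
    reg_lambda: float = 1.0,  # assumed > 0 so empty bins never divide by zero
) -> Tuple[Optional[float], float]:
    """Return (threshold, gain) for the best split of one feature.

    Rows are bucketed into bins once; only bin edges are tried as
    thresholds, so the scan costs O(n_bins) rather than O(n_rows).
    Rows with value < threshold go to the left child.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return None, 0.0  # constant feature: nothing to split on
    width = (hi - lo) / n_bins

    # Build the histogram: per-bin sums of gradients and hessians.
    g_hist = [0.0] * n_bins
    h_hist = [0.0] * n_bins
    for v, g, h in zip(values, grads, hess):
        b = min(int((v - lo) / width), n_bins - 1)  # clamp the max into the last bin
        g_hist[b] += g
        h_hist[b] += h

    g_total, h_total = sum(g_hist), sum(h_hist)
    parent = leaf_score(g_total, h_total, reg_lambda)

    best_threshold: Optional[float] = None
    best_gain = 0.0
    g_left = h_left = 0.0
    # Scan bin edges left to right, accumulating left-child statistics.
    for b in range(n_bins - 1):
        g_left += g_hist[b]
        h_left += h_hist[b]
        g_right, h_right = g_total - g_left, h_total - h_left
        gain = 0.5 * (
            leaf_score(g_left, h_left, reg_lambda)
            + leaf_score(g_right, h_right, reg_lambda)
            - parent
        )
        if gain > best_gain:
            best_threshold, best_gain = lo + (b + 1) * width, gain
    return best_threshold, best_gain


# Toy data: squared-error loss with initial prediction 0,
# so gradient = prediction - target = -target and hessian = 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
threshold, gain = best_histogram_split(x, [-t for t in y], [1.0] * len(x))
print(threshold, gain)  # the threshold found here is 4.5
```

Raising `reg_lambda` shrinks every leaf score and therefore the gain, which is how the L2 term discourages splits supported by little data. Using more bins gives finer thresholds at the cost of a longer scan.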
| Library | Example | Description |
|---|---|---|
| XGBoost | `import xgboost as xgb`<br>`model = xgb.XGBClassifier()`<br>`model.fit(X_train, y_train)` | Most mature library with extensive hyperparameter control, strong L1/L2 regularization (`alpha`/`lambda`), sparsity-aware split finding for missing values, and excellent documentation; default level-wise (depth-wise) tree growth keeps trees balanced |
| LightGBM | `import lightgbm as lgb`<br>`model = lgb.LGBMClassifier()`<br>`model.fit(X_train, y_train)` | Typically the fastest to train on large datasets thanks to histogram-based binning and leaf-wise growth; offers gradient-based one-side sampling (GOSS) to reduce the rows used per iteration and exclusive feature bundling (EFB) to reduce the effective number of features; usually a lower memory footprint than XGBoost |