LightGBM Cheat Sheet

Updated 2026-05-21

LightGBM (Light Gradient Boosting Machine) is a gradient boosting framework developed by Microsoft, introduced at NeurIPS 2017, and built for fast, memory-efficient training on large tabular datasets. It addresses the core bottleneck of traditional GBDT, the expensive exact-split search, through histogram-based learning, leaf-wise tree growth, Gradient-based One-Side Sampling (GOSS), and Exclusive Feature Bundling (EFB). The original paper reports training up to 20× faster than conventional GBDT implementations at comparable accuracy; the speedup over XGBoost depends on the dataset and configuration. The critical mental model: unlike XGBoost's default depth-wise growth, LightGBM splits the single leaf with the maximum loss reduction at each step. This converges faster but requires careful tuning of num_leaves and min_data_in_leaf to prevent deep, lopsided trees that overfit small datasets.
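As a rough illustration of the tuning point above, the sketch below builds a parameter dictionary in which `num_leaves` stays below the `2**max_depth` ceiling of an equivalent depth-wise tree. The helper `capped_num_leaves` and the specific values (depth 6, 50 rows per leaf) are assumptions chosen for illustration, not recommended settings; `num_leaves`, `max_depth`, `min_data_in_leaf`, and `learning_rate` are LightGBM parameter names.

```python
def capped_num_leaves(requested: int, max_depth: int) -> int:
    """Hypothetical helper: clamp num_leaves below 2**max_depth.

    A depth-wise tree of depth d has at most 2**d leaves, so a leaf-wise
    tree with more leaves than that cannot respect the same depth limit.
    """
    ceiling = 2 ** max_depth
    return max(2, min(requested, ceiling - 1))


max_depth = 6  # illustrative value, not a recommendation

params = {
    "objective": "binary",
    "max_depth": max_depth,
    # Main complexity control for leaf-wise growth.
    "num_leaves": capped_num_leaves(255, max_depth),
    # Minimum rows per leaf; larger values discourage tiny, overfit leaves.
    "min_data_in_leaf": 50,
    "learning_rate": 0.05,
}

# With LightGBM installed, this dict could be passed as
# lgb.train(params, train_set), where train_set is an lgb.Dataset.
print(params)
```

On small datasets, raising `min_data_in_leaf` and lowering `num_leaves` together is usually the first thing to try when validation loss diverges from training loss.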

What This Cheat Sheet Covers

This topic spans 16 focused tables and 95 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Core Architecture — Histogram-Based Learning and Leaf-Wise Growth
Table 2: Boosting Types
Table 3: Key Parameters — Tree Structure and Complexity
Table 4: Key Parameters — Learning Rate, Iterations, and Sampling
Table 5: Regularization Parameters
Table 6: Objective Functions and Metrics
Table 7: Python API — Native Interface
Table 8: Python API — Scikit-learn Interface
Table 9: Early Stopping and Callbacks
Table 10: Feature Importance and SHAP Interpretability
Table 11: Categorical Feature Handling
Table 12: GPU Acceleration
Table 13: Handling Imbalanced Data
Table 14: Distributed and Parallel Learning
Table 15: Custom Objectives and Metrics
Table 16: Advanced Parameters and Specialized Features

Table 1: Core Architecture — Histogram-Based Learning and Leaf-Wise Growth

LightGBM's speed advantage comes from structural changes to how trees are built and how training rows are sampled. Understanding these mechanisms makes every subsequent parameter decision more intuitive.

Technique	Example	Description
Histogram-based split finding	`max_bin=255 # default bin count`	Continuous features are discretized into integer bins (default 255). Building a histogram still takes one pass over the data, but evaluating candidate split points for a feature drops from O(#data) to O(#bins), which is the primary source of LightGBM's speed advantage.
Leaf-wise (best-first) tree growth	`# grows the single leaf with max delta loss`	• Expands the leaf that reduces loss the most at each step, rather than all leaves at a given depth • converges faster but can overfit without proper `num_leaves` and `min_data_in_leaf` guards
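GOSS (Gradient-based One-Side Sampling)	`data_sample_strategy='goss' # LightGBM >= 4.0; boosting='goss' in older versions`	• Keeps all large-gradient instances (more informative) and randomly samples small-gradient ones, up-weighting the sampled ones to compensate • maintains accuracy while training on a fraction of data • not enabled by default; plain `gbdt` uses all rows unless bagging is configured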
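The GOSS row above can be sketched in a few lines of plain Python. This is a simplified illustration of the sampling idea, not LightGBM's internal implementation: the function name `goss_sample` is hypothetical, the defaults mirror LightGBM's `top_rate=0.2` and `other_rate=0.1`, and the `(1 - top_rate) / other_rate` rescaling follows the description in the LightGBM paper.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return {row_index: weight} for the rows kept by a GOSS-style sample.

    Simplified sketch: rows are ranked by absolute gradient, the top
    fraction is kept as-is, and a random fraction of the remainder is
    kept with an amplified weight.
    """
    n = len(gradients)
    # Rank rows from largest to smallest absolute gradient.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)

    n_top = round(top_rate * n)
    n_other = round(other_rate * n)

    top_idx = order[:n_top]   # always kept
    rest = order[n_top:]      # candidates for random sampling

    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))

    # Up-weight sampled small-gradient rows so their total contribution
    # approximates that of the full small-gradient population.
    amplify = (1.0 - top_rate) / other_rate

    weights = {i: 1.0 for i in top_idx}
    weights.update({i: amplify for i in sampled})
    return weights


if __name__ == "__main__":
    # Toy gradients: row i has gradient i, so rows 80..99 are the "large" ones.
    toy_gradients = [float(i) for i in range(100)]
    kept = goss_sample(toy_gradients)
    print(f"kept {len(kept)} of {len(toy_gradients)} rows")
```

With the default rates, each boosting iteration works with roughly 30% of the rows, which is where the additional speedup over plain histogram-based training comes from.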
