LightGBM Cheat Sheet

Updated 2026-05-21

LightGBM (Light Gradient Boosting Machine) is a gradient boosting framework developed by Microsoft, introduced at NeurIPS 2017, and built for high-speed, memory-efficient training on large tabular datasets. It addresses the core bottleneck of traditional GBDT, the expensive exact-split search, through histogram-based learning, leaf-wise tree growth, Gradient-based One-Side Sampling (GOSS), and Exclusive Feature Bundling (EFB). The original paper reports training up to 20× faster than conventional GBDT implementations at comparable accuracy; the speedup over XGBoost in practice depends on the dataset and configuration. The critical mental model: unlike XGBoost's default depth-wise (level-by-level) growth, LightGBM splits the single leaf with the maximum loss reduction at each step. This converges faster but requires careful tuning of num_leaves and min_data_in_leaf to prevent deep, lopsided trees that overfit small datasets.
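As a rough illustration of the tuning point above, the sketch below shows a parameter dictionary that constrains leaf-wise growth. The specific values are assumptions chosen for illustration, not tuned recommendations; only the parameter names and their stated defaults come from LightGBM itself.

```python
# Illustrative values only (assumptions, not tuned recommendations).
params = {
    "num_leaves": 15,        # default is 31; fewer leaves means simpler trees
    "max_depth": 6,          # default is -1 (no limit); caps lopsided growth
    "min_data_in_leaf": 50,  # default is 20; larger values resist overfitting
}

# A tree of depth d holds at most 2**d leaves, so keeping num_leaves
# below that bound ensures the depth cap is not the only constraint.
assert params["num_leaves"] < 2 ** params["max_depth"]
```

A dictionary like this would typically be passed as the first argument to `lightgbm.train` together with a training `Dataset`.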

What This Cheat Sheet Covers

This topic spans 16 focused tables and 95 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.

  • Table 1: Core Architecture - Histogram-Based Learning and Leaf-Wise Growth
  • Table 2: Boosting Types
  • Table 3: Key Parameters - Tree Structure and Complexity
  • Table 4: Key Parameters - Learning Rate, Iterations, and Sampling
  • Table 5: Regularization Parameters
  • Table 6: Objective Functions and Metrics
  • Table 7: Python API - Native Interface
  • Table 8: Python API - Scikit-learn Interface
  • Table 9: Early Stopping and Callbacks
  • Table 10: Feature Importance and SHAP Interpretability
  • Table 11: Categorical Feature Handling
  • Table 12: GPU Acceleration
  • Table 13: Handling Imbalanced Data
  • Table 14: Distributed and Parallel Learning
  • Table 15: Custom Objectives and Metrics
  • Table 16: Advanced Parameters and Specialized Features

Table 1: Core Architecture — Histogram-Based Learning and Leaf-Wise Growth

LightGBM's speed advantage comes mainly from two structural innovations in how trees are built, histogram-based split finding and leaf-wise growth, complemented by gradient-based sampling (GOSS). Understanding these mechanisms makes every subsequent parameter decision more intuitive.

Technique | Example | Description
Histogram-based split finding | max_bin=255  # default bin count | Continuous features are discretized into integer bins (default 255), reducing the cost of evaluating candidate splits from O(#data) to O(#bins) per feature once the histogram is built. This is the primary source of LightGBM's speed advantage.
Leaf-wise (best-first) tree growth | # grows the single leaf with max delta loss | Expands the leaf that reduces loss the most at each step, rather than all leaves at a given depth. Converges faster but can overfit without proper num_leaves and min_data_in_leaf guards.
GOSS (Gradient-based One-Side Sampling) | data_sample_strategy='goss'  # LightGBM >= 4.0; boosting='goss' in older versions | Not enabled by default: plain gbdt trains on all rows unless bagging is configured. GOSS keeps all large-gradient instances (more informative) and randomly samples small-gradient ones, up-weighting the sampled rows to compensate. Maintains accuracy while training on a fraction of the data.
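The two data-reduction ideas in the table can be sketched in plain Python. This is a simplified illustration, not LightGBM's implementation: the equal-width binning, the gain formula with an L2 term `reg_lambda`, and the helper names `bin_feature`, `best_histogram_split`, and `goss_sample` are all assumptions made for the example. The `top_rate` and `other_rate` defaults mirror LightGBM's parameters of the same names.

```python
import random

def bin_feature(values, max_bin=255):
    """Map continuous values to integer bins (equal-width bins are an assumption)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / max_bin or 1.0  # fall back to 1.0 for a constant feature
    return [min(int((v - lo) / width), max_bin - 1) for v in values]

def best_histogram_split(bins, grads, hess, max_bin=255, reg_lambda=1.0):
    """Find the best split by scanning bin boundaries, not individual rows."""
    g_hist = [0.0] * max_bin
    h_hist = [0.0] * max_bin
    for b, g, h in zip(bins, grads, hess):  # one pass over the data builds the histogram
        g_hist[b] += g
        h_hist[b] += h
    g_total, h_total = sum(g_hist), sum(h_hist)

    def score(g, h):
        return g * g / (h + reg_lambda)

    best_bin, best_gain = None, 0.0
    g_left = h_left = 0.0
    for b in range(max_bin - 1):  # O(#bins) scan over candidate thresholds
        g_left += g_hist[b]
        h_left += h_hist[b]
        g_right, h_right = g_total - g_left, h_total - h_left
        gain = score(g_left, h_left) + score(g_right, h_right) - score(g_total, h_total)
        if gain > best_gain:
            best_bin, best_gain = b, gain
    return best_bin, best_gain

def goss_sample(grads, top_rate=0.2, other_rate=0.1, seed=0):
    """Return {row index: weight}: keep large-gradient rows, subsample the rest."""
    n = len(grads)
    order = sorted(range(n), key=lambda i: abs(grads[i]), reverse=True)
    n_top, n_other = int(top_rate * n), int(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    sampled = random.Random(seed).sample(rest, min(n_other, len(rest)))
    amplify = (1.0 - top_rate) / other_rate  # up-weights the sampled small-gradient rows
    weights = {i: 1.0 for i in top}
    weights.update({i: amplify for i in sampled})
    return weights

# Toy data: squared-error gradients (prediction 0 minus label) for two clusters.
feature = [0, 1, 2, 3, 10, 11, 12, 13]
grads = [0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, -1.0]
hess = [1.0] * 8
bins = bin_feature(feature, max_bin=4)
split_bin, split_gain = best_histogram_split(bins, grads, hess, max_bin=4)
```

On the toy data the scan only inspects three bin boundaries regardless of how many rows fall into each bin, which is the point of the histogram approach. In `goss_sample`, the factor `(1 - top_rate) / other_rate` is the reweighting described in the GOSS row.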
