Machine Learning Core Cheat Sheet

Updated 2026-04-20

Machine learning is a subset of artificial intelligence focused on building systems that learn patterns from data without explicit programming. At its core, machine learning involves training mathematical models on historical data to make predictions or decisions on new, unseen data. Understanding the bias-variance tradeoff is fundamental: models must balance the ability to capture complex patterns (low bias) against stability across different datasets (low variance), as optimizing one often degrades the other. As of 2026, gradient boosting frameworks and neural network architectures continue to dominate, while scikit-learn remains the go-to library for classical ML with native support for missing values and categorical features in its histogram-based estimators.
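The bias-variance tradeoff described above can be made concrete with a small experiment. The following is a minimal sketch, not a definitive benchmark: it assumes a toy target (noisy samples of sin(x)) and a hand-written k-nearest-neighbors regressor, both chosen purely for illustration, and the helper names (`knn_predict`, `mse`, `make_data`) are hypothetical. A small `k` gives a flexible model (low bias, high variance); `k` equal to the training-set size gives a model that predicts roughly the same value everywhere (high bias, low variance).

```python
import math
import random


def knn_predict(x_train, y_train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x))[:k]
    return sum(y_train[i] for i in nearest) / k


def mse(x_train, y_train, xs, ys, k):
    """Mean squared error of k-NN predictions on the points (xs, ys)."""
    errors = [(knn_predict(x_train, y_train, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def make_data(n, rng):
    """Noisy samples of sin(x) on [0, 2*pi] (an assumed toy target)."""
    xs = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys


rng = random.Random(0)
x_train, y_train = make_data(40, rng)
x_test, y_test = make_data(200, rng)

# k=1 memorizes the training set; k=40 averages every training target.
results = {}
for k in (1, 5, 40):
    train_err = mse(x_train, y_train, x_train, y_train, k)
    test_err = mse(x_train, y_train, x_test, y_test, k)
    results[k] = (train_err, test_err)
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With `k=1` the training error is zero because each training point is its own nearest neighbor, which says nothing about performance on new data. With `k=40` the model ignores the input almost entirely, so both errors stay high. An intermediate `k` is where the two error sources are typically best balanced.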

What This Cheat Sheet Covers

This topic spans 21 focused tables and 216 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Learning Paradigms
Table 2: Supervised Learning Algorithms - Classification
Table 3: Supervised Learning Algorithms - Regression
Table 4: Unsupervised Learning Algorithms
Table 5: Reinforcement Learning Methods
Table 6: Evaluation Metrics - Classification
Table 7: Evaluation Metrics - Regression
Table 8: Model Validation Techniques
Table 9: Regularization Techniques
Table 10: Feature Engineering Techniques
Table 11: Data Preprocessing Techniques
Table 12: Ensemble Methods
Table 13: Hyperparameter Tuning
Table 14: Neural Network Components
Table 15: Optimization Algorithms
Table 16: Loss Functions
Table 17: Overfitting and Underfitting
Table 18: SVM Kernels
Table 19: Transfer Learning and Fine-Tuning
Table 20: Model Interpretability
Table 21: Common Pitfalls and Best Practices

Table 1: Learning Paradigms

Every ML project starts with one fundamental question — what kind of signal is the model learning from? These paradigms answer it: whether you have labeled answers (supervised), only raw structure to discover (unsupervised), a reward to chase (reinforcement), or some clever middle ground that squeezes learning out of scarce or self-generated labels.

Paradigm	Example	Description
Supervised Learning	`model.fit(X_train, y_train)`	Learns from labeled data where each input has a corresponding target output; used for classification and regression tasks.
Unsupervised Learning	`kmeans = KMeans(n_clusters=3)` `kmeans.fit(X)`	Discovers hidden patterns in unlabeled data without target outputs; used for clustering, dimensionality reduction, and anomaly detection.
Semi-Supervised Learning	`model.fit(X_labeled, y_labeled)` `model.predict(X_unlabeled)`	Combines a small labeled dataset with a large pool of unlabeled data to improve learning when labeling is expensive; a common approach trains on the labeled data, predicts labels for the unlabeled data, and retrains on both.
Self-Supervised Learning	`# predict masked tokens` `output = model(masked_input)`	Generates supervisory signals from the data itself by creating pretext tasks such as masking tokens or predicting the next token.
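The semi-supervised and self-supervised rows are the least obvious from one-line snippets, so here is a minimal sketch of both using only the Python standard library. Everything in it is an assumption made for illustration: the nearest-centroid "model", the single round of pseudo-labeling, the two-blob toy data, and the helper names (`centroids`, `predict`, `pseudo_label_fit`, `mask_tokens`) are hypothetical and do not come from any library.

```python
import random


def centroids(X, y):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(model, key=lambda lab: sum((a - b) ** 2 for a, b in zip(model[lab], features)))


def pseudo_label_fit(X_labeled, y_labeled, X_unlabeled):
    """Semi-supervised: fit on labels, label the unlabeled pool, refit on both."""
    model = centroids(X_labeled, y_labeled)
    pseudo = [predict(model, x) for x in X_unlabeled]
    return centroids(X_labeled + X_unlabeled, y_labeled + pseudo)


def mask_tokens(tokens, mask_rate, rng, mask="[MASK]"):
    """Self-supervised pretext task: hide tokens; the hidden tokens are the targets."""
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append(mask)
            targets[i] = tok  # the label comes from the data itself
        else:
            masked.append(tok)
    return masked, targets


def blob(cx, cy, n, rng):
    """n noisy 2-D points around (cx, cy); assumed toy data."""
    return [[cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)] for _ in range(n)]


rng = random.Random(0)
X_labeled = blob(0, 0, 2, rng) + blob(5, 5, 2, rng)   # only four labeled points
y_labeled = [0, 0, 1, 1]
X_unlabeled = blob(0, 0, 50, rng) + blob(5, 5, 50, rng)

model = pseudo_label_fit(X_labeled, y_labeled, X_unlabeled)
pred_a = predict(model, [0.2, -0.1])
pred_b = predict(model, [4.8, 5.3])
print(pred_a, pred_b)

tokens = "the cat sat on the mat".split()
masked, targets = mask_tokens(tokens, 0.3, rng)
print(masked, targets)
```

In practice, pseudo-labeling usually keeps only high-confidence predictions and repeats for several rounds; scikit-learn provides `sklearn.semi_supervised.SelfTrainingClassifier` for that workflow. The masking helper only builds the (input, target) pairs; a real self-supervised setup would then train a model to recover `targets` from `masked`.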
