Supervised learning is a machine learning paradigm where models learn from labeled training data to make predictions on unseen data. Each training example consists of an input-output pair, enabling the algorithm to learn a mapping function from inputs to outputs. The two primary tasks are classification (predicting discrete categories) and regression (predicting continuous values). The fundamental challenge lies in balancing the bias-variance tradeoff: simple models underfit (high bias, low variance), complex models overfit (low bias, high variance), and the goal is finding a model that generalizes well by minimizing total error on unseen data. Modern practice increasingly combines strong models with explainability tools like SHAP and LIME to meet regulatory and interpretability demands.
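The bias-variance tradeoff described above can be made concrete with a small experiment. The sketch below is illustrative only: it uses a hand-written k-nearest-neighbors regressor (the helper names `knn_predict` and `mse`, the sine-shaped target, and the noise level are all assumptions, not part of this cheat sheet) and varies `k` to move from a very flexible model to a very rigid one.

```python
import math
import random

def knn_predict(x_train, y_train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x))[:k]
    return sum(y_train[i] for i in nearest) / k

def mse(x_train, y_train, xs, ys, k):
    """Mean squared error of k-NN predictions on the points (xs, ys)."""
    errors = [(knn_predict(x_train, y_train, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

# Synthetic data (assumed for illustration): y = sin(x) plus Gaussian noise.
rng = random.Random(0)
n = 40
x_train = [2 * math.pi * i / n for i in range(n)]
x_test = [x + math.pi / n for x in x_train]  # points between the training inputs
y_train = [math.sin(x) + rng.gauss(0, 0.2) for x in x_train]
y_test = [math.sin(x) + rng.gauss(0, 0.2) for x in x_test]

# k = 1 is the most flexible model (it memorizes the training set),
# k = n is the most rigid (it always predicts the training mean).
results = {}
for k in (1, 5, n):
    train_err = mse(x_train, y_train, x_train, y_train, k)
    test_err = mse(x_train, y_train, x_test, y_test, k)
    results[k] = (train_err, test_err)
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With `k=1` the training error is exactly zero because every training point is its own nearest neighbor, which is the low-bias, high-variance end of the spectrum. With `k=n` the model ignores the input entirely, which is the high-bias, low-variance end. An intermediate `k` is where a lower error on unseen data would typically be expected.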
What This Cheat Sheet Covers
This cheat sheet spans 27 focused tables and 165 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.
Table 1: Learning Task Types
| Type | Example | Description |
|---|---|---|
| Binary classification | `y_pred = model.predict(X)  # [0, 1, 1, 0]` | Assigns each instance to one of two classes; the output is a discrete binary label (positive/negative, yes/no, spam/ham). |
| Multiclass classification | `y_pred = model.predict(X)  # [0, 2, 1, 3]` | Assigns each instance to exactly one of three or more classes; the output is a single, mutually exclusive label. |
| Regression | `y_pred = model.predict([[1500]])  # 250000.0` | Predicts continuous numerical values from input features; the output is real-valued rather than discrete. |
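The `model.predict(...)` calls in the table leave out how `model` was created. As a minimal sketch of the fit-then-predict pattern, the block below defines two small stand-in models in plain Python. The class names `NearestCentroidClassifier` and `SimpleLinearRegressor` and all of the data are hypothetical and chosen only to reproduce the shape of the outputs shown in the table. They are not the API of any particular library.

```python
from collections import defaultdict

class NearestCentroidClassifier:
    """Hypothetical stand-in: predicts the class whose feature mean is closest."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row)
        # One centroid (per-feature mean) for each class label.
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [
            min(self.centroids_, key=lambda c: dist2(row, self.centroids_[c]))
            for row in X
        ]

class SimpleLinearRegressor:
    """Hypothetical stand-in: ordinary least squares on a single feature."""

    def fit(self, X, y):
        xs = [row[0] for row in X]
        mean_x, mean_y = sum(xs) / len(xs), sum(y) / len(y)
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, y))
        self.slope_ = sxy / sxx
        self.intercept_ = mean_y - self.slope_ * mean_x
        return self

    def predict(self, X):
        return [self.intercept_ + self.slope_ * row[0] for row in X]

# Binary classification: two possible labels (made-up data).
binary = NearestCentroidClassifier().fit([[0.0], [0.2], [1.0], [1.2]], [0, 0, 1, 1])
binary_pred = binary.predict([[0.1], [0.9], [1.1], [0.3]])
print(binary_pred)  # [0, 1, 1, 0]

# Multiclass classification: four mutually exclusive labels (made-up data).
multi = NearestCentroidClassifier().fit([[0.0], [1.0], [2.0], [3.0]], [0, 1, 2, 3])
multi_pred = multi.predict([[0.1], [2.2], [0.9], [2.8]])
print(multi_pred)  # [0, 2, 1, 3]

# Regression: real-valued output, e.g. price from floor area (made-up data).
reg = SimpleLinearRegressor().fit([[1000], [2000], [3000]], [200000, 300000, 400000])
reg_pred = reg.predict([[1500]])
print(reg_pred)  # [250000.0]
```

The same `fit`/`predict` shape covers all three task types. What changes is the kind of value `predict` returns: one of two labels, one of several labels, or a real number.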