Semi-supervised learning is a machine learning paradigm that leverages both limited labeled data and abundant unlabeled data to train models, occupying the middle ground between supervised and unsupervised learning. This approach is particularly valuable in domains where labeling is expensive, time-consuming, or requires expert knowledge, such as medical imaging, natural language processing, and computer vision. The core insight rests on key assumptions about data structure: the smoothness assumption (nearby points share labels), the cluster assumption (decision boundaries pass through low-density regions rather than cutting across clusters), and the manifold assumption (data lies on lower-dimensional manifolds). Semi-supervised methods use consistency regularization, pseudo-labeling, and graph-based propagation to extract supervisory signals from unlabeled data, enabling models to generalize better than supervised learning alone. Their main pitfalls are confirmation bias and error propagation, which arise when incorrect pseudo-labels are fed back into training, so most methods include safeguards such as confidence thresholds.
What This Cheat Sheet Covers
This topic spans 12 focused tables and 63 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Core Semi-Supervised Methods
These are the workhorse algorithms most practitioners reach for first, and nearly all of them share a single idea: treat the model's confident predictions on unlabeled data as if they were real labels. FixMatch's weak-then-strong augmentation trick became the template that FlexMatch and later methods refine with adaptive thresholds; the earlier MixMatch family builds its targets by averaging, sharpening, and mixing samples, and other methods smooth targets with a teacher. Start here to get the mental model the whole field builds on.
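As a minimal sketch of that shared idea, the snippet below computes a confidence-thresholded pseudo-label loss in the style of FixMatch. It uses only the Python standard library; the function names, the list-of-logits input format, and the default `tau=0.95` are illustrative assumptions, not taken from any particular library.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def thresholded_pseudo_label_loss(weak_logits, strong_logits, tau=0.95):
    """Hypothetical helper: FixMatch-style unlabeled loss.

    weak_logits / strong_logits hold one logit vector per unlabeled sample,
    computed on the weakly and strongly augmented view respectively.
    Returns (loss averaged over the whole batch, number of samples kept).
    """
    total = 0.0
    kept = 0
    for w, s in zip(weak_logits, strong_logits):
        p_weak = softmax(w)
        confidence = max(p_weak)
        if confidence <= tau:
            continue  # low-confidence prediction: contributes no loss
        pseudo_label = p_weak.index(confidence)  # hard label from weak view
        p_strong = softmax(s)
        total += -math.log(p_strong[pseudo_label])  # cross-entropy term
        kept += 1
    n = len(weak_logits)
    return (total / n if n else 0.0), kept


loss, kept = thresholded_pseudo_label_loss(
    weak_logits=[[5.0, 0.0, 0.0], [0.1, 0.0, 0.0]],
    strong_logits=[[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
)
```

Only the first sample clears the threshold here, so `kept` is 1 and the second sample is ignored. Dividing by the full batch size rather than by `kept` means a batch with few confident predictions produces a small loss, which is one common way to limit the influence of early, unreliable pseudo-labels.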
| Method | Example | Description |
|---|---|---|
| FixMatch | `p_weak = model(flip_shift(x))`<br>`p_strong = model(RandAugment(x))`<br>`if max(p_weak) > tau: loss += CE(p_strong, argmax(p_weak))` | Combines weak and strong augmentations with confidence thresholding: pseudo-labels generated from weakly augmented images supervise predictions on strongly augmented versions only when confidence exceeds a fixed threshold. |
| FlexMatch | `tau_c = min(tau_max, tau_0 * (sigma_c / sigma_target))`<br>`if p_c > tau_c: accept_pseudo_label` | Adaptive per-class thresholds via Curriculum Pseudo Labeling (CPL): dynamically adjusts confidence thresholds based on each class's learning progress, allowing more informative unlabeled samples to enter training earlier. |
| MixMatch | `q = avg([model(aug_k(x)) for k in range(K)])`<br>`q_sharp = Sharpen(q, T)`<br>`X_mix = MixUp(labeled + [(x, q_sharp) for x in unlabeled])` | Guesses low-entropy labels by averaging predictions over K augmentations, sharpens the distribution, then applies MixUp to blend labeled and pseudo-labeled data, combining consistency regularization with entropy minimization. |
| ReMixMatch | `p_model = running_avg(unlabeled_preds)`<br>`p_true = marginal(labeled_dist)`<br>`align = p_true / p_model`<br>`q = Normalize(q * align)`<br>plus CTAugment anchoring | Adds distribution alignment (rescales model predictions toward the true label distribution, then renormalizes) and augmentation anchoring (a weakly augmented anchor supplies the target for several strongly augmented variants) to MixMatch, reducing distribution mismatch bias. |
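
The averaging, sharpening, and distribution-alignment steps described in the last two rows can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: the function names are hypothetical, predictions are plain lists of class probabilities, and the temperature `T=0.5` is only an example value.

```python
def average_predictions(preds):
    """Average K probability vectors (one per augmentation of the same input)."""
    k = len(preds)
    return [sum(col) / k for col in zip(*preds)]


def sharpen(q, T=0.5):
    """Lower the entropy of q by raising it to 1/T and renormalizing."""
    powered = [p ** (1.0 / T) for p in q]
    total = sum(powered)
    return [p / total for p in powered]


def align_distribution(q, p_true, p_model):
    """Rescale q by p_true / p_model per class, then renormalize.

    p_true:  class marginal estimated from labeled data (assumed given)
    p_model: running average of model predictions on unlabeled data
    """
    scaled = [qi * pt / pm for qi, pt, pm in zip(q, p_true, p_model)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Two augmentations of one unlabeled input, two classes.
q = average_predictions([[0.6, 0.4], [0.8, 0.2]])
q_sharp = sharpen(q, T=0.5)

# The model over-predicts class 0 (0.8 vs. a true marginal of 0.5),
# so alignment shifts probability mass toward class 1.
q_aligned = align_distribution([0.5, 0.5], p_true=[0.5, 0.5], p_model=[0.8, 0.2])
```

Sharpening pushes the averaged guess toward its most likely class, while alignment corrects for classes the model predicts too often or too rarely. The MixUp and augmentation steps are omitted here because they depend on the data pipeline.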