Few-shot and zero-shot learning are machine learning paradigms that enable models to generalize to new tasks or classes with minimal labeled examples, ranging from none (zero-shot) to a small handful (few-shot). These approaches are foundational to in-context learning in large language models and to meta-learning in computer vision, where models adapt quickly by transferring knowledge from prior experience rather than requiring extensive task-specific training. The key challenge is to design representations, prompting strategies, and meta-learning algorithms that maximize generalization from extremely limited supervision, which makes these techniques essential for real-world applications where labeled data is scarce, expensive, or rapidly changing. Choices in demonstration selection, calibration, and architecture can determine whether a model performs near state-of-the-art or at random-guess accuracy on new tasks.
What This Cheat Sheet Covers
This topic spans 14 focused tables and 97 indexed concepts. The complete table-by-table outline below runs from foundational concepts through advanced details.
Table 1: Core Learning Paradigms
| Paradigm | Example | Description |
|---|---|---|
| Zero-shot learning | Classify sentiment: "I loved it!" → Positive | • Model performs a task with no examples, relying entirely on pre-trained knowledge and task instructions • works best for general tasks that match the training distribution. |
| One-shot learning | Example: "Great movie" → Positive; Classify: "Terrible film" | • Model receives exactly one example per class before inference • bridges zero-shot and few-shot by providing a minimal demonstration of the desired behavior. |
| Few-shot learning | 3 examples: "Loved it" → Pos; "Hated it" → Neg; "Okay" → Neu | • Model is given 2–10 labeled examples per class • can significantly improve performance over zero-shot for specialized or nuanced tasks. |
| k-shot learning | k=5: 5 examples per class | • Formalization of few-shot in which k specifies the exact number of examples per class • typically k ∈ {1, 2, 5, 10} in research benchmarks. |
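
The paradigms in Table 1 differ only in how many demonstrations per class precede the query, so a single prompt builder parameterized by k can cover all of them. The following is a minimal sketch under stated assumptions: the function name `build_k_shot_prompt`, the `DEMOS` list, and the `"text" -> label` prompt layout are illustrative choices, not a standard API or a recommended format.

```python
from collections import defaultdict

def build_k_shot_prompt(instruction, demos, query, k):
    """Build a prompt with up to k demonstrations per class.

    k=0 yields a zero-shot prompt, k=1 one-shot, k>1 few-shot.
    The layout is a hypothetical convention chosen for illustration.
    """
    # Keep the first k demonstrations seen for each label.
    per_class = defaultdict(list)
    for text, label in demos:
        if len(per_class[label]) < k:
            per_class[label].append(text)

    lines = [instruction]
    for label, texts in per_class.items():
        for text in texts:
            lines.append(f'"{text}" -> {label}')
    # The query is left unlabeled for the model to complete.
    lines.append(f'"{query}" ->')
    return "\n".join(lines)

# Hypothetical labeled pool, reusing the examples from the table.
DEMOS = [
    ("Loved it", "Positive"),
    ("Hated it", "Negative"),
    ("Okay", "Neutral"),
    ("Great movie", "Positive"),
    ("Terrible film", "Negative"),
]

zero_shot = build_k_shot_prompt("Classify sentiment:", DEMOS, "I loved it!", k=0)
one_shot = build_k_shot_prompt("Classify sentiment:", DEMOS, "I loved it!", k=1)
two_shot = build_k_shot_prompt("Classify sentiment:", DEMOS, "I loved it!", k=2)

print(one_shot)
```

This sketch takes the first k examples per class in pool order and groups them by label. In practice the choice and ordering of demonstrations can matter, so the selection step is the part most likely to be replaced.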