Large Language Models (LLMs) represent a major shift in artificial intelligence: they function as general-purpose reasoning engines that can perform hundreds of diverse tasks through natural language interaction. Unlike traditional AI systems trained narrowly for a single objective, modern LLMs exhibit emergent abilities across text generation, multimodal understanding, code synthesis, and complex reasoning. This includes reasoning models (o1, DeepSeek-R1, Gemini) that scale test-time compute to reach expert-level performance on some scientific, mathematical, and coding benchmarks. Understanding these capabilities matters because task formulation strongly influences success: the same model can excel or fail depending on how you frame the problem, structure the prompt, and select the inference pattern. A critical insight: LLMs don't execute tasks deterministically like classical programs. They generate probabilistic responses shaped by training data, prompting techniques, and contextual grounding, so reproducibility and factual accuracy remain ongoing challenges that require deliberate mitigation strategies.
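To make the point about probabilistic responses concrete, here is a minimal sketch of temperature-based next-token sampling. The token list, the scores, and the helper names (`softmax`, `sample_token`, `sample_run`) are invented for illustration and do not come from any particular model or library. Real systems sample over vocabularies of many thousands of tokens, and hosted models may still vary slightly even with greedy decoding.

```python
import math
import random

# Hypothetical next-token candidates and raw scores, invented for illustration.
TOKENS = ["Paris", "Lyon", "France"]
LOGITS = [4.0, 2.0, 1.0]

def softmax(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, logits, temperature, rng):
    """Pick one token: greedy when temperature is 0, otherwise a weighted random draw."""
    if temperature == 0:
        return tokens[logits.index(max(logits))]
    weights = softmax(logits, temperature)
    return rng.choices(tokens, weights=weights, k=1)[0]

def sample_run(temperature, seed, n=5):
    """Draw n tokens with a seeded generator so the run can be repeated exactly."""
    rng = random.Random(seed)
    return [sample_token(TOKENS, LOGITS, temperature, rng) for _ in range(n)]

if __name__ == "__main__":
    print("greedy:      ", sample_run(temperature=0, seed=1))
    print("seeded T=1.0:", sample_run(temperature=1.0, seed=7))
    print("same seed:   ", sample_run(temperature=1.0, seed=7))
```

In this toy setup, greedy decoding always returns the highest-scoring token, and fixing the seed makes a sampled run repeatable. Those are the two simplest levers for reproducibility; neither addresses factual accuracy, which depends on grounding and verification.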
What This Cheat Sheet Covers
This topic spans 14 focused tables and 114 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.
Table 1: Text Understanding and Analysis
Parsing the meaning of text is the bedrock skill that every downstream LLM application relies on. The capabilities here range from surface-level classification to deep structural extraction, and getting them right determines how much noise ends up in your data pipelines and RAG retrieval.
| Capability | Example | Description |
|---|---|---|
| Summarization | Summarize this 10-page report in 3 bullet points | • Condenses long documents into shorter versions that preserve key information • supports both abstractive (paraphrase) and extractive (verbatim) modes |
| Question answering | What is the capital of France? → Paris | • Extracts or generates direct answers from context or parametric knowledge • spans factoid, multi-hop, and open-domain QA |
| Sentiment analysis | "I love this product!" → Positive | • Classifies text into sentiment categories (positive, negative, neutral) • detects emotional tone and opinion polarity |
| Named entity recognition | "Apple CEO Tim Cook" → ORG: Apple, PER: Tim Cook | • Identifies and categorizes named entities (people, organizations, locations, dates) within text |
| Text classification | "Breaking: Stock market crashes" → Category: Finance | • Assigns predefined category labels to text • supports topic detection, intent classification, and spam filtering |
| Information extraction | Extract all dates and amounts from this invoice | • Identifies and structures specific data from unstructured text • outputs JSON, tables, or key-value pairs |
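
As one way to put the capabilities in the table above to work, the sketch below frames two of them (sentiment classification and entity recognition) as prompts that request JSON, then validates a reply before it enters a pipeline. The template wording, the `PROMPTS` dictionary, and the helpers `build_prompt` and `parse_sentiment` are hypothetical. No real model API is called; the replies are hard-coded strings standing in for model output.

```python
import json

# Hypothetical prompt templates; doubled braces are literal braces after .format().
PROMPTS = {
    "sentiment": (
        "Classify the sentiment of the text as positive, negative, or neutral. "
        'Reply with JSON like {{"label": "positive"}}.\n\nText: {text}'
    ),
    "ner": (
        "List the named entities in the text. "
        'Reply with JSON like {{"entities": [{{"text": "...", "type": "ORG"}}]}}.'
        "\n\nText: {text}"
    ),
}

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}

def build_prompt(task, text):
    """Fill the template for the given task with the input text."""
    return PROMPTS[task].format(text=text)

def parse_sentiment(raw_reply):
    """Return a normalized sentiment label, or None if the reply is malformed."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return None  # the reply was not valid JSON
    label = data.get("label") if isinstance(data, dict) else None
    if not isinstance(label, str):
        return None
    label = label.strip().lower()
    return label if label in ALLOWED_SENTIMENTS else None

if __name__ == "__main__":
    print(build_prompt("sentiment", "I love this product!"))
    # Stand-ins for model replies: one well-formed, one free-text.
    print(parse_sentiment('{"label": "Positive"}'))
    print(parse_sentiment("The sentiment is positive."))
```

Rejecting replies that fail validation, rather than passing them through, is one simple way to limit the noise that reaches downstream data pipelines and retrieval indexes.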