AI/LLM Task Capabilities Cheat Sheet

Updated 2026-05-28

Large Language Models (LLMs) represent a major shift in artificial intelligence: they function as general-purpose reasoning engines that can perform hundreds of diverse tasks through natural language interaction. Unlike traditional AI systems trained narrowly for a single objective, modern LLMs exhibit emergent abilities across text generation, multimodal understanding, code synthesis, and complex reasoning. This includes reasoning models (o1, DeepSeek-R1, Gemini) that scale test-time compute to achieve expert-level performance on scientific, mathematical, and coding benchmarks. Understanding these capabilities matters because task formulation directly affects success: the same model can excel or fail depending on how you frame the problem, structure the prompt, and select the inference pattern. A critical insight is that LLMs do not execute tasks deterministically like classical programs. They generate probabilistic responses shaped by training data, prompting techniques, and contextual grounding, so reproducibility and factual accuracy remain ongoing challenges that require deliberate mitigation strategies.

What This Cheat Sheet Covers

This cheat sheet spans 14 focused tables and 114 indexed concepts. The outline below lists every table, from foundational concepts through advanced details.

Table 1: Text Understanding and Analysis
Table 2: Text Generation and Transformation
Table 3: Reasoning and Problem Solving
Table 4: Code and Technical Tasks
Table 5: Multimodal Capabilities
Table 6: Conversational and Interactive Tasks
Table 7: Retrieval and Knowledge Tasks
Table 8: Structured Output and Formatting
Table 9: Advanced Reasoning Patterns
Table 10: Tool Use and Agentic Workflows
Table 11: Creative and Generative Tasks
Table 12: Data and Document Processing
Table 13: Safety and Alignment Tasks
Table 14: Emerging and Specialized Capabilities

Table 1: Text Understanding and Analysis

Parsing the meaning of text is the bedrock skill that every downstream LLM application relies on. The capabilities here range from surface-level classification to deep structural extraction, and getting them right determines how much noise ends up in your data pipelines and RAG retrieval.

Capability	Example	Description
Summarization	`Summarize this 10-page report in 3 bullet points`	• Condenses long documents into shorter versions preserving key information • supports both abstractive (paraphrase) and extractive (verbatim) modes
Question Answering	`What is the capital of France? → Paris`	• Extracts or generates direct answers from context or parametric knowledge • spans factoid, multi-hop, and open-domain QA
Sentiment Analysis	`"I love this product!" → Positive`	• Classifies text into sentiment categories (positive, negative, neutral) • detects emotional tone and opinion polarity
Named Entity Recognition (NER)	`"Apple CEO Tim Cook" → ORG: Apple, PER: Tim Cook`	• Identifies and categorizes named entities within text • covers people, organizations, locations, and dates
Text Classification	`"Breaking: Stock market crashes" → Category: Finance`	• Assigns predefined category labels to text • supports topic detection, intent classification, and spam filtering
Information Extraction	`Extract all dates and amounts from this invoice`	• Identifies and structures specific data from unstructured text • outputs JSON, tables, or key-value pairs
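
The Information Extraction row says the model can return JSON, but a model reply is still free text until it has been parsed and checked. The following is a minimal sketch of that step, not a definitive implementation. The function names (`build_extraction_prompt`, `parse_extraction`), the two field names, and the sample reply are all hypothetical and chosen for illustration; no model API is called.

```python
import json

# Hypothetical field list for an invoice; not taken from any real schema.
REQUIRED_FIELDS = ("dates", "amounts")


def build_extraction_prompt(document: str) -> str:
    """Build a prompt that asks for JSON only, with a fixed set of keys."""
    return (
        "Extract all dates and amounts from this invoice.\n"
        'Reply with JSON only, using the keys "dates" and "amounts", '
        "each mapped to a list of strings.\n\n"
        f"Invoice:\n{document}"
    )


def parse_extraction(reply: str) -> dict:
    """Parse a model reply and check that it has the expected shape."""
    # Models sometimes wrap JSON in prose, so keep only the outermost braces.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(reply[start : end + 1])
    for field in REQUIRED_FIELDS:
        values = data.get(field)
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            raise ValueError(f"field {field!r} must be a list of strings")
    return {field: data[field] for field in REQUIRED_FIELDS}


# Stand-in for a real model reply; no API is called in this sketch.
sample_reply = 'Here is the result: {"dates": ["2026-05-01"], "amounts": ["$120.00"]}'
result = parse_extraction(sample_reply)
print(result)
```

Rejecting a malformed reply with `ValueError` lets the caller decide whether to retry the prompt or flag the document for review, which keeps unvalidated output out of downstream pipelines.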

