AI reasoning models represent a fundamental shift from traditional large language models — instead of predicting the next token immediately, they allocate test-time compute to explore solution paths, verify intermediate steps, and self-correct before producing output. This extended reasoning capability, enabled by reinforcement learning with verifiable rewards (RLVR), allows models to match or exceed human expert performance on mathematics, coding, and scientific reasoning benchmarks. The tradeoff is clear: reasoning models spend more tokens (and cost more per query) in exchange for significantly higher accuracy on hard problems, making the decision of when to use them vs. fast models a critical architectural choice.
What This Cheat Sheet Covers
This topic spans 15 focused tables and 112 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Core Reasoning Mechanisms
These fundamental techniques define how reasoning models generate extended thought processes before producing final answers, shifting from immediate next-token prediction to multi-step exploration and verification.
| Mechanism | Example | Description |
|---|---|---|
Allocate 10× more inference compute to improve AIME score from 13% → 79% | Dynamically increasing computational resources during inference to explore more solution paths • Scales performance predictably as a function of compute budget • Complements training-time scaling | |
<thinking> Step 1: Check if n is prime... Step 2: Factorize if composite... </thinking> | Internal reasoning steps generated before visible output • Can be hidden (counted but not shown) or visible (displayed to user) • Budget controlled via token limits or effort levels | |
Generate 15,000-token reasoning trace for proof verification vs. 500-token CoT | Substantially longer internal thought processes than chain-of-thought • Enables backtracking, self-correction, and multi-attempt exploration • Unlocks harder problems | |
"Let's break this down: First... Second... Therefore..." | Explicit step-by-step reasoning in natural language • Prompt-induced technique vs. built-in reasoning mode • Shorter and more structured than extended reasoning |