Database query optimization is the systematic process of improving query execution speed and resource efficiency through analysis of execution plans, strategic indexing, and query rewriting. It sits at the intersection of database administration, application performance engineering, and data architecture, directly impacting user experience and infrastructure costs. The key mental model: the query optimizer makes decisions based on statistics and heuristics, but understanding how it thinks—through execution plans—lets you guide it toward better choices when automatic optimization falls short. In 2026, modern databases add AI-driven feedback loops (memory grant feedback, DOP feedback, adaptive joins) that continuously refine plans at runtime, making plan analysis more important than ever.
What This Cheat Sheet Covers
This topic spans 21 focused tables and 196 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Execution Plan Analysis
Reading execution plans is the single most important skill in query optimization — every other technique is guesswork without it. Plans reveal how the optimizer actually executes a query: which access methods it chose, how it joins tables, where it spends time, and where its cardinality estimates diverge from reality.
| Concept | Example | Description |
|---|---|---|
EXPLAIN SELECT * FROM users WHERE id = 1; | Shows the query execution plan without executing the query — reveals access methods, join types, and estimated costs. | |
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending'; | Executes the query and displays actual runtime statistics including real row counts, execution time, and buffers used — critical for identifying planner estimate errors. | |
EXPLAIN (ANALYZE, BUFFERS) SELECT ...; | Shows buffer cache hits vs disk reads — reveals I/O patterns and identifies queries missing working memory. | |
Estimated Number of Rows: 1000 | Optimizer's predicted row count at each operation — large discrepancies from actual rows indicate stale statistics or complex predicates causing poor plan choices. | |
Seq Scan on users | Reads every row sequentially — often indicates missing indexes or non-selective queries where most rows match the filter. | |
Index Scan using idx_email on users | Traverses an index in order, reading multiple rows — used for range queries or when ORDER BY matches index order. | |
Index Seek (NonClustered) | Uses B-tree navigation to jump directly to matching rows — most efficient access method for highly selective equality or narrow range predicates. | |
Key Lookup (Clustered) | • Additional lookup to retrieve columns not included in the index — indicates covering index opportunity • can dominate cost when returning many rows |