Database query optimization is the systematic process of improving query execution speed and resource efficiency through analysis of execution plans, strategic indexing, and query rewriting. It sits at the intersection of database administration, application performance engineering, and data architecture, directly impacting user experience and infrastructure costs. The key mental model: the query optimizer makes decisions based on statistics and heuristics, but understanding how it thinks—through execution plans—lets you guide it toward better choices when automatic optimization falls short.
What This Cheat Sheet Covers
This topic spans 17 focused tables and 124 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Execution Plan Analysis
| Concept | Example | Description |
|---|---|---|
EXPLAIN SELECT * FROM users WHERE id = 1; | • Shows the query execution plan chosen by the optimizer without executing the query • reveals access methods, join types, and estimated costs. | |
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending'; | • Executes the query and displays actual runtime statistics including real row counts, execution time, and buffers used • critical for identifying planner estimate errors. | |
Index Seek, Hash Join, Sort | • Building blocks of query plans that represent specific operations • understanding operators like Nested Loop, Hash Match, and Sort reveals where queries spend time. | |
Seq Scan on users | • Reads every row in a table sequentially • often indicates missing indexes or non-selective queries where most rows match the filter. | |
Index Scan using idx_email on users | • Traverses an index in order, reading multiple rows • used for range queries or when ORDER BY matches index order • slower than seek for highly selective queries. | |
Index Seek (NonClustered) | • Uses B-tree navigation to jump directly to matching rows • most efficient access method for highly selective queries with equality or narrow range predicates. | |
Key Lookup (Clustered) | • Additional lookup to retrieve columns not included in the index • indicates opportunity for covering indexes • can dominate query cost when returning many rows. | |
Nested Loops ├── Index Seek └── Index Seek | • Iterates outer input, probing inner input for each row • efficient when outer input is small and inner has an index on join column • O(n \times m) complexity. |