SQL Cheat Sheet

SQL (Structured Query Language) is the standard language for managing and manipulating relational databases, used across virtually all industries for data retrieval, analysis, and transformation. Whether querying a simple table or orchestrating complex multi-table joins with aggregations and window functions, SQL provides a declarative syntax where you specify what you want, not how to get it — the database engine handles optimization. Mastering SQL means understanding not just the syntax, but the logical execution order (FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY), which fundamentally differs from how queries are written and explains many common pitfalls.

Table 1: Basic Query Structure

Every SQL query is built from the same six core clauses, but they don't run in the order you write them — the engine processes FROM first and SELECT near the end, which is why aliases work in ORDER BY but not in WHERE. Knowing this logical processing order is the single most useful mental model for explaining why HAVING filters groups while WHERE filters rows, and why row-limiting clauses without ORDER BY return non-deterministic results.

Clause	Example	Description
SELECT	`SELECT name, salary FROM employees`	• Specifies which columns to return and computes their expressions • logically executed near the end (after FROM, WHERE, GROUP BY, HAVING), so its aliases are not visible to those clauses.
FROM	`FROM employees`	• Identifies the source tables, views, or subqueries and applies any JOINs • logically the first clause processed, so its tables and columns are visible to every later clause.
WHERE	`WHERE salary > 50000`	• Filters individual rows before grouping • cannot reference aggregate functions or SELECT-list aliases (SELECT hasn't run yet).
GROUP BY	`GROUP BY department`	• Collapses rows into one row per distinct group so aggregates like SUM and COUNT can be applied • every non-aggregated column in SELECT must also appear here.
HAVING	`HAVING COUNT(*) > 10`	• Filters groups after aggregation • the only clause where aggregate functions are valid filter conditions.
ORDER BY	`ORDER BY salary DESC`	• Sorts the final result set • runs after SELECT, so it can reference SELECT-list aliases; `ASC` is the default direction.
DISTINCT	`SELECT DISTINCT country FROM customers`	• Removes duplicate rows from the result, comparing the full tuple of selected columns • not per-column — every selected column counts toward uniqueness.
LIMIT / TOP / FETCH FIRST	`LIMIT 10` `TOP 10` `FETCH FIRST 10 ROWS ONLY`	• Caps the number of rows returned • syntax varies by dialect: `LIMIT` in PostgreSQL, MySQL, SQLite, Snowflake; `TOP` in SQL Server; `FETCH FIRST` is the ANSI standard supported by Oracle, DB2, Snowflake, and PostgreSQL.
OFFSET	`LIMIT 10 OFFSET 20`	• Skips a number of rows before returning results, typically for pagination • without `ORDER BY` the skipped subset is undefined, so different pages can repeat or miss rows.
WINDOW clause	`SELECT name, RANK() OVER w` `FROM employees` `WINDOW w AS (PARTITION BY dept ORDER BY salary DESC)`	• Defines a named window specification that multiple window functions in the same query can reference via `OVER w` • removes duplication when several aggregates share one partition / order.
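
The logical processing order described above can be traced on a tiny data set. This is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in engine; the `employees` rows are hypothetical sample data, and the numbered comments mark the logical order, not the physical plan an optimizer might choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ann', 'Eng',   90000),
        ('Bob', 'Eng',   40000),
        ('Cyd', 'Eng',   70000),
        ('Dee', 'Sales', 60000),
        ('Eli', 'Sales', 30000);
""")

rows = conn.execute("""
    SELECT department, COUNT(*) AS headcount   -- 5. computed after grouping
    FROM employees                             -- 1. source rows
    WHERE salary > 50000                       -- 2. row filter (drops Bob, Eli)
    GROUP BY department                        -- 3. one row per department
    HAVING COUNT(*) > 1                        -- 4. group filter (drops Sales)
    ORDER BY headcount DESC                    -- 6. SELECT alias is visible here
""").fetchall()
print(rows)
```

Only the Eng group survives: WHERE removes two low-salary rows first, and HAVING then removes the Sales group because it has a single remaining row.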

Table 2: DDL Statements

Data Definition Language (DDL) shapes the database itself: creating, altering, and dropping the tables, columns, and databases that hold your data. Unlike DML, several DDL statements behave very differently across engines: TRUNCATE is transactional in PostgreSQL and SQL Server but auto-commits in MySQL and Oracle, and table renames use ALTER TABLE ... RENAME TO in PostgreSQL, MySQL, and Oracle but the sp_rename procedure in SQL Server.

Statement	Example	Description
CREATE TABLE	`CREATE TABLE users (` `id INT PRIMARY KEY,` `name VARCHAR(100) NOT NULL,` `email VARCHAR(255) UNIQUE` `)`	Creates a new table with column definitions, data types, and constraints.
ALTER TABLE ADD COLUMN	`ALTER TABLE employees` `ADD COLUMN phone VARCHAR(20)`	• Adds a new column to an existing table • existing rows get NULL unless a `DEFAULT` is specified.
ALTER TABLE MODIFY / ALTER COLUMN	`ALTER TABLE products` `ALTER COLUMN price DECIMAL(12,2)`	• Changes a column's definition (data type, constraints) • syntax varies: the example is SQL Server's form, PostgreSQL writes `ALTER COLUMN price TYPE DECIMAL(12,2)`, and MySQL / Oracle use `MODIFY`.
ALTER TABLE DROP COLUMN	`ALTER TABLE users DROP COLUMN legacy_flag`	• Permanently removes a column and its data • in PostgreSQL it's metadata-only, so disk space isn't reclaimed until the table is rewritten.
ALTER TABLE RENAME	`ALTER TABLE customer RENAME TO customers`	• Renames an existing table • syntax varies (`RENAME TO` in PostgreSQL / Oracle / MySQL, `sp_rename` in SQL Server).
DROP TABLE	`DROP TABLE IF EXISTS temp_staging`	• Permanently deletes a table, all its data, indexes, and triggers • `IF EXISTS` prevents an error if already absent; `CASCADE` removes dependent foreign-key constraints and views.
TRUNCATE TABLE	`TRUNCATE TABLE audit_log`	• Removes all rows while preserving table structure • much faster than `DELETE`; does not fire row-level DELETE triggers; rollback works in PostgreSQL / SQL Server but auto-commits in MySQL / Oracle.
CREATE DATABASE	`CREATE DATABASE analytics`	Creates a new database on the server — a top-level container with its own schemas, users, and storage.
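
To make the `ADD COLUMN` behavior concrete, here is a small sketch run against SQLite through Python's `sqlite3` module. SQLite is only a stand-in (its type names and `ALTER TABLE` support differ from the engines listed above), and the `status` column and sample row are hypothetical additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        name  VARCHAR(100) NOT NULL,
        email VARCHAR(255) UNIQUE
    );
    INSERT INTO users (id, name, email) VALUES (1, 'Ann', 'ann@example.com');

    -- No DEFAULT: the existing row gets NULL in the new column.
    ALTER TABLE users ADD COLUMN phone VARCHAR(20);
    -- With DEFAULT: the existing row reports the default value.
    ALTER TABLE users ADD COLUMN status VARCHAR(10) DEFAULT 'active';
""")
row = conn.execute("SELECT id, name, phone, status FROM users").fetchone()
print(row)

# The UNIQUE constraint from CREATE TABLE rejects a duplicate email.
try:
    conn.execute(
        "INSERT INTO users (id, name, email) VALUES (2, 'Bob', 'ann@example.com')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```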

Table 3: DML Statements

Data Manipulation Language (DML) statements write to data rather than to the schema: INSERT adds rows, UPDATE modifies them, DELETE removes them, and MERGE does any combination of the three in one set-based statement. Every DML statement that modifies rows can be rolled back inside an explicit transaction, and an UPDATE or DELETE silently affects every row in the table if you forget the WHERE clause.

Statement	Example	Description
INSERT INTO ... VALUES	`INSERT INTO orders (customer_id, amount, status)` `VALUES (42, 299.99, 'pending')`	• Inserts one or more rows into a table using literal values • column list is optional, but if omitted the values are matched positionally against the table's declared column order.
INSERT INTO ... SELECT	`INSERT INTO archive` `SELECT * FROM orders WHERE order_date < '2025-01-01'`	• Bulk-inserts rows from a query result; columns are paired positionally between SELECT and the target • standard pattern for copying, archiving, or staging data.
UPDATE ... SET	`UPDATE employees` `SET salary = salary * 1.1` `WHERE department = 'Engineering'`	• Modifies existing rows that match the WHERE condition • without WHERE every row is updated — a classic production-disaster mistake.
UPDATE with JOIN	`UPDATE e` `SET e.dept_name = d.name` `FROM employees e` `JOIN departments d ON e.dept_id = d.id`	• Updates rows using data from a joined table • syntax differs across dialects: T-SQL uses `UPDATE...FROM...JOIN`, MySQL uses `UPDATE t1 JOIN t2...SET`, PostgreSQL uses `UPDATE...FROM` with the join condition in WHERE.
DELETE FROM	`DELETE FROM sessions` `WHERE last_active < NOW() - INTERVAL '30 days'`	• Deletes matching rows and logs each row removal individually • fires row-level triggers and can be rolled back; without WHERE it empties the table (but leaves the schema intact).
MERGE (UPSERT)	`MERGE target AS t` `USING source AS s ON t.id = s.id` `WHEN MATCHED THEN UPDATE SET t.val = s.val` `WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val)` `WHEN NOT MATCHED BY SOURCE THEN DELETE;`	• Combines INSERT, UPDATE, and DELETE in one statement based on a match condition (added to PostgreSQL in v15) • requires a trailing semicolon in SQL Server; under concurrency, `WITH (HOLDLOCK)` is needed to prevent the well-known UPSERT race condition.
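
The missing-WHERE hazard and the transactional safety net can be sketched as follows, again using SQLite through Python's `sqlite3` module as an assumed stand-in engine with hypothetical sample rows. Checking the affected-row count inside a transaction before committing is one common defensive habit, not the only one.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# BEGIN / ROLLBACK statements below control the transaction explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "Engineering", 100.0), ("Bob", "Sales", 100.0), ("Cyd", "Sales", 100.0)],
)

conn.execute("BEGIN")
cur = conn.execute("UPDATE employees SET salary = salary * 2")  # WHERE forgotten
rows_touched = cur.rowcount       # every row in the table was updated
conn.execute("ROLLBACK")          # undo the mistake before it is committed

cur = conn.execute(
    "UPDATE employees SET salary = salary * 2 WHERE department = 'Engineering'"
)
rows_intended = cur.rowcount
salaries = conn.execute("SELECT name, salary FROM employees ORDER BY name").fetchall()
print(rows_touched, rows_intended, salaries)
```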

Table 4: Join Types

Joins combine rows from two or more tables into a single result. Choosing the right join controls which rows survive when keys don't match on every side, and a misplaced filter or wrong join type is one of the most common sources of silently wrong query results.

Type	Example	Description
INNER JOIN	`SELECT * FROM orders o` `INNER JOIN customers c ON o.customer_id = c.id`	• Returns only rows with matching values in both tables • the default and most common join • rows with NULL keys on either side are excluded because NULL never equals NULL.
LEFT JOIN (LEFT OUTER JOIN)	`SELECT * FROM customers c` `LEFT JOIN orders o ON c.id = o.customer_id`	• Returns all rows from the left table plus matching rows from the right • left rows with no match get NULL in every right-table column • a filter on a right-table column in `WHERE` (other than an `IS NULL` test) discards those NULL-extended rows and silently turns the join into an INNER JOIN, so put such filters in the `ON` clause instead.
RIGHT JOIN (RIGHT OUTER JOIN)	`SELECT * FROM orders o` `RIGHT JOIN customers c ON o.customer_id = c.id`	• Returns all rows from the right table plus matching rows from the left • mirror of LEFT JOIN; rarely used because flipping the table order to a LEFT JOIN is more readable and more portable.
FULL OUTER JOIN	`SELECT * FROM customers c` `FULL OUTER JOIN orders o ON c.id = o.customer_id`	• Returns all rows from both tables, with NULLs for the side that didn't match • not supported in MySQL — emulate by `UNION`ing a LEFT JOIN with a RIGHT JOIN.
CROSS JOIN	`SELECT * FROM colors CROSS JOIN sizes`	• Produces a Cartesian product — every row from the first table paired with every row from the second (N × M rows) • takes no `ON` condition • the old comma syntax `FROM a, b` produces the same Cartesian product.
SELF JOIN	`SELECT e.name, m.name AS manager` `FROM employees e` `JOIN employees m ON e.manager_id = m.id`	• Joins a table to itself using two different aliases • useful for hierarchical data (employee–manager) or comparing rows within the same table • aliases are mandatory — without them the column references are ambiguous.
LATERAL JOIN (CROSS JOIN LATERAL)	`SELECT d.name, e.name, e.salary` `FROM departments d` `CROSS JOIN LATERAL (` `SELECT name, salary FROM employees` `WHERE department_id = d.id` `ORDER BY salary DESC LIMIT 3` `) e`	• Allows the right-hand subquery to reference columns from tables earlier in the FROM clause, evaluating it once per outer row (like a correlated subquery in the FROM list) • enables top-N-per-group and per-row table-function calls that a plain JOIN cannot express • pair with `LEFT JOIN LATERAL ... ON true` to keep outer rows that produce no inner rows.
CROSS APPLY / OUTER APPLY	`SELECT d.name, e.name` `FROM departments d` `CROSS APPLY (` `SELECT TOP 3 name FROM employees` `WHERE department_id = d.id` `ORDER BY salary DESC` `) e`	• SQL Server's equivalent of LATERAL: evaluates the right-hand table expression once per left-hand row • CROSS APPLY drops left rows whose right expression returns no rows (like INNER JOIN) • OUTER APPLY keeps them with NULLs on the right (like LEFT JOIN).
NATURAL JOIN	`SELECT * FROM employees NATURAL JOIN departments`	• Automatically joins on every column with the same name in both tables and collapses those columns into one • risky in production — adding a same-named column to either table later silently changes the join condition • prefer explicit `ON` or `USING (col)`.
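
The LEFT JOIN filter-placement trap from the table above is easy to reproduce. The sketch below runs both variants on SQLite via Python's `sqlite3` module; the `status` column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cyd');
    INSERT INTO orders VALUES (10, 1, 'paid'), (11, 2, 'pending');
""")

# Filter in WHERE: NULL-extended rows fail the test, so unmatched customers vanish.
filter_in_where = conn.execute("""
    SELECT c.name, o.id
    FROM customers c
    LEFT JOIN orders o ON c.id = o.customer_id
    WHERE o.status = 'paid'
    ORDER BY c.name
""").fetchall()

# Filter in ON: it only limits which orders match; every customer is kept.
filter_in_on = conn.execute("""
    SELECT c.name, o.id
    FROM customers c
    LEFT JOIN orders o ON c.id = o.customer_id AND o.status = 'paid'
    ORDER BY c.name
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```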

Table 5: Aggregate Functions

Aggregate functions collapse many rows into a single summary value over a group, and almost all of them silently ignore NULL inputs. Two traps catch even experienced users: COUNT(*) counts every row while COUNT(col) skips NULLs in col, and SUM/AVG/MIN/MAX over an all-NULL or empty group return NULL — never zero.

Function	Example	Description
COUNT()	`SELECT COUNT(*) FROM orders` `SELECT COUNT(DISTINCT customer_id) FROM orders`	• `COUNT(*)` counts all rows including NULLs • `COUNT(col)` skips NULLs; `DISTINCT` counts unique values.
SUM()	`SELECT SUM(amount) FROM payments`	• Totals all non-NULL values in a column • returns NULL if all values are NULL.
AVG()	`SELECT AVG(salary) FROM employees WHERE dept = 'Sales'`	• Arithmetic mean of non-NULL values • NULLs are excluded, not treated as zero.
MAX()	`SELECT MAX(order_date) FROM orders`	• Returns the largest value in a column • works on dates and strings too.
MIN()	`SELECT MIN(price) FROM products WHERE active = 1`	Returns the smallest value in a column.
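GROUP_CONCAT / STRING_AGG	`STRING_AGG(name, ', ') WITHIN GROUP (ORDER BY name)`	• Concatenates string values from a group into a single delimited string • SQL Server/PostgreSQL use STRING_AGG • MySQL uses GROUP_CONCAT • the `WITHIN GROUP (ORDER BY ...)` ordering shown is SQL Server's form; PostgreSQL puts the ordering inside the call: `STRING_AGG(name, ', ' ORDER BY name)`.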
VARIANCE / VAR_SAMP	`SELECT VAR(salary) FROM employees` `SELECT VARIANCE(salary) FROM employees`	• Statistical variance of the values in a group • VAR/VAR_SAMP (sample), VARP/VAR_POP (population) • name varies by database.
STDDEV / STDEV	`SELECT STDEV(salary) FROM employees`	• Standard deviation of values • STDEV/STDDEV_SAMP (sample), STDEVP/STDDEV_POP (population) • name varies by database.
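
The two NULL traps named in the introduction to this table can be checked directly. This is a minimal sketch on SQLite through Python's `sqlite3` module with hypothetical `payments` rows; wrapping the aggregate in `COALESCE` is one common way to get zero instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO payments VALUES (1, 10), (2, NULL), (3, 30);
""")

# COUNT(*) sees three rows; COUNT(amount), SUM and AVG skip the NULL amount.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(amount), SUM(amount), AVG(amount) FROM payments"
).fetchone()

# No rows match: COUNT(*) is 0 but SUM is NULL unless wrapped in COALESCE.
empty = conn.execute(
    "SELECT COUNT(*), SUM(amount), COALESCE(SUM(amount), 0) "
    "FROM payments WHERE id > 100"
).fetchone()

print(counts)
print(empty)
```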

Table 6: Window Functions — Ranking

The ranking family assigns a position number to each row inside an ordered partition. The differences between them are entirely about how ties and gaps are handled, so picking the wrong one quietly changes results. Every ranking function requires ORDER BY inside its OVER clause; PARTITION BY is optional and restarts the numbering for each group.

Function	Example	Description
ROW_NUMBER()	`ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC)`	• Assigns a unique sequential integer per row within a partition • ties receive different numbers (broken by ORDER BY or arbitrarily).
RANK()	`RANK() OVER (ORDER BY score DESC)`	• Assigns rank with gaps after ties — two rows tied at rank 2 are both rank 2; next row is rank 4 • competition-style ranking.
DENSE_RANK()	`DENSE_RANK() OVER (ORDER BY score DESC)`	• Like RANK but no gaps — two rows tied at rank 2 are both rank 2 • next row is rank 3.
NTILE()	`NTILE(4) OVER (ORDER BY salary)`	• Divides the ordered partition into N buckets as equal as possible • if rows don't divide evenly the earliest buckets get the extra rows.
PERCENT_RANK()	`PERCENT_RANK() OVER (ORDER BY salary)`	• Relative rank `(rank - 1) / (total rows - 1)`, range 0 to 1 inclusive • the first row is always 0.0.
CUME_DIST()	`CUME_DIST() OVER (ORDER BY salary)`	• Cumulative distribution — rows with value ≤ current row, divided by total rows • range 1/N to 1 • tied rows share the same value.
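
A side-by-side run shows how the three most common ranking functions treat the same tie. The sketch assumes SQLite 3.25 or later (the first release with window functions) behind Python's `sqlite3` module, and a hypothetical `results` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (name TEXT, score INTEGER);
    INSERT INTO results VALUES ('Ann', 95), ('Bob', 90), ('Cyd', 90), ('Dee', 80);
""")

rows = conn.execute("""
    SELECT name, score,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,  -- name breaks the tie
           RANK()       OVER (ORDER BY score DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
    FROM results
    ORDER BY score DESC, name
""").fetchall()

for row in rows:
    print(row)
```

Bob and Cyd tie on score: `ROW_NUMBER` still gives them 2 and 3, `RANK` gives both 2 and then skips to 4 for Dee, and `DENSE_RANK` gives both 2 and continues with 3.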

Table 7: Window Functions — Aggregate & Analytical

Window functions compute a value for each row using a window of related rows, without collapsing the result the way GROUP BY does. The window is shaped by three parts of the OVER clause — PARTITION BY (which rows belong together), ORDER BY (their sequence), and the optional frame (ROWS / RANGE — which of those rows are visible from the current one). Most window-function surprises come from the frame: adding ORDER BY without an explicit frame defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which silently turns whole-partition aggregates into running ones and makes LAST_VALUE / NTH_VALUE return the wrong row.

Function	Example	Description
SUM() OVER()	`SUM(amount) OVER (PARTITION BY dept ORDER BY date)`	• Running total within a partition when `ORDER BY` is present • without `ORDER BY`, returns the partition total beside every row.
AVG() OVER()	`AVG(salary) OVER (PARTITION BY department_id)`	• Partition mean on every row when no `ORDER BY` is given • adding `ORDER BY` turns it into a running average.
COUNT() OVER()	`COUNT(*) OVER (PARTITION BY customer_id)`	• Returns the number of rows in the partition beside every detail row • useful for showing a per-group total without `GROUP BY`.
LAG()	`LAG(close_price, 1, 0) OVER (ORDER BY date)`	• Returns the value `offset` rows before the current row in the window • the first row returns `NULL` unless a default is given as the third argument.
LEAD()	`LEAD(close_price, 1, 0) OVER (ORDER BY date)`	• Returns the value `offset` rows after the current row in the window • the last row returns `NULL` unless a default is given as the third argument.
FIRST_VALUE()	`FIRST_VALUE(salary) OVER (PARTITION BY dept ORDER BY hire_date)`	• Returns the value at the first row of the window frame • the default frame starts at the partition's first row, so the result is usually what you expect.
LAST_VALUE()	`LAST_VALUE(salary) OVER (PARTITION BY dept ORDER BY hire_date` `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)`	• Returns the value at the last row of the window frame • the default frame stops at the current row, so an explicit `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING` is needed for the true partition last value.
NTH_VALUE()	`NTH_VALUE(salary, 2) OVER (` `PARTITION BY dept ORDER BY salary DESC` `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)`	• Returns the value at the nth row of the window frame (counting from 1) • returns `NULL` if `n` exceeds the frame, so expand the frame to `UNBOUNDED FOLLOWING` when you need access to rows past the current position.
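
The `LAST_VALUE` surprise is easiest to see with both frames in one query. The following sketch assumes SQLite 3.25+ via Python's `sqlite3` module and uses hypothetical `employees` rows in a single department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (dept TEXT, hire_date TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Eng', '2020-01-01', 100),
        ('Eng', '2021-01-01', 200),
        ('Eng', '2022-01-01', 300);
""")

rows = conn.execute("""
    SELECT hire_date,
           -- Default frame ends at the current row, so this echoes each row's own salary.
           LAST_VALUE(salary) OVER (PARTITION BY dept ORDER BY hire_date) AS default_frame,
           -- Explicit full-partition frame returns the salary of the latest hire.
           LAST_VALUE(salary) OVER (
               PARTITION BY dept ORDER BY hire_date
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS full_frame
    FROM employees
    ORDER BY hire_date
""").fetchall()

for row in rows:
    print(row)
```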

Table 8: Window Frame Specification

The frame clause narrows a window function's view to a subset of the partition relative to the current row, controlling exactly which rows feed running totals, moving averages, and lookups like LAST_VALUE(). The most important and most surprising rule: when ORDER BY is present but no frame is specified, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (peer-based, not row-based) — which silently fuses tied rows into a single running-sum step.

Clause	Example	Description
ROWS BETWEEN ... AND ...	`SUM(amount) OVER (` `ORDER BY date` `ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)`	• Frame defined by physical row offsets: counts rows regardless of their ORDER BY values • preferred for fixed-size moving calculations such as a 7-row window (which equals 7 days only when there is exactly one row per day).
RANGE BETWEEN ... AND ...	`SUM(amount) OVER (` `ORDER BY date` `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)`	• Frame defined by logical value ranges — all rows that are ORDER BY peers of the current row are included together • this is the default frame when ORDER BY is present and no frame clause is given.
UNBOUNDED PRECEDING	`ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`	Frame starts at the first row of the current partition — used for cumulative aggregations from the partition start.
UNBOUNDED FOLLOWING	`ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING`	Frame extends to the last row of the current partition — required for `LAST_VALUE()` and `NTH_VALUE()` to see all subsequent rows in standard SQL.
N PRECEDING / N FOLLOWING	`ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING`	• Frame spans N rows before to M rows after the current row • in ROWS mode the offset must be a non-negative integer; in RANGE mode it requires exactly one numeric or date ORDER BY column.
CURRENT ROW	`ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`	• In ROWS mode means just the current row; in RANGE/GROUPS mode it spans all peer rows with the same ORDER BY value • this peer behavior is what produces the running-total surprise on duplicate ordering keys.
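
The difference between a row-based and a peer-based frame only shows up when the ordering key has duplicates. This sketch (SQLite 3.25+ via Python's `sqlite3`, hypothetical `sales` rows) computes the same running total with both frame types; the two tied rows deliberately carry the same amount so the output does not depend on how the engine orders the tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (day TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('2025-01-01', 10),
        ('2025-01-02', 20),
        ('2025-01-02', 20),   -- duplicate ordering key: a peer of the row above
        ('2025-01-03', 40);
""")

rows = conn.execute("""
    SELECT day, amount,
           SUM(amount) OVER (ORDER BY day
               ROWS  BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum,
           SUM(amount) OVER (ORDER BY day
               RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum
    FROM sales
    ORDER BY day, rows_sum
""").fetchall()

for row in rows:
    print(row)
```

With `ROWS` the total climbs one row at a time (30, then 50), while `RANGE` jumps straight to 50 for both rows dated 2025-01-02 because peers enter the frame together.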

Table 9: Subqueries & CTEs

Subqueries nest one query inside another to compute filters, derived tables, or single values, while Common Table Expressions (CTEs) name those intermediate results to break a complex query into readable, sometimes recursive, steps. Knowing when each one short-circuits, gets correlated per row, or is materialized once is the difference between a query that scales and one that crawls.

Technique	Example	Description
Subquery (WHERE)	`SELECT * FROM employees` `WHERE dept_id IN (SELECT id FROM departments WHERE budget > 100000)`	• Embeds a query inside another query for filtering • the inner query runs once (non-correlated) or per row (correlated).
Correlated Subquery	`SELECT * FROM employees e1` `WHERE salary > (SELECT AVG(salary) FROM employees e2 WHERE e2.dept = e1.dept)`	• References an outer query's columns inside the subquery • logically executes once per outer row, though modern optimizers often decorrelate it into a join.
Subquery (FROM — Derived Table)	`SELECT dept, avg_sal FROM` `(SELECT dept, AVG(salary) AS avg_sal FROM employees GROUP BY dept) t`	• A subquery in the FROM clause • must have a table alias in most dialects • computed once and referenced like a table.
EXISTS	`SELECT name FROM customers c WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)`	• Tests for row existence and stops at the first match • `NOT EXISTS` is also the safe replacement for `NOT IN` when the subquery may contain NULLs.
CTE (Common Table Expression)	`WITH ranked AS (` `SELECT *, RANK() OVER (ORDER BY salary DESC) AS rk FROM employees` `)` `SELECT * FROM ranked WHERE rk <= 10`	• Named temporary result set introduced with `WITH` and scoped to a single statement • PostgreSQL 12+ may inline (fold) a CTE unless you write `AS MATERIALIZED`.
Recursive CTE	`WITH RECURSIVE org AS (` `SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL` `UNION ALL` `SELECT e.id, e.name, e.manager_id FROM employees e JOIN org o ON e.manager_id = o.id` `)` `SELECT * FROM org`	• CTE that references itself to traverse hierarchical or graph data (org charts, bills of materials) • requires an anchor member, `UNION ALL` (or `UNION`) with a recursive member, and a condition that eventually returns no rows • PostgreSQL, MySQL, and SQLite require the `RECURSIVE` keyword; SQL Server and Oracle use plain `WITH`.
Scalar Subquery	`SELECT name, (SELECT MAX(salary) FROM employees) AS max_sal FROM employees`	• A subquery used as a single value (one row, one column) in `SELECT`, `WHERE`, etc. • returns `NULL` if zero rows and raises a cardinality error if it returns more than one.
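
The `NOT IN` versus `NOT EXISTS` point in the EXISTS row deserves a concrete run, because the failure is silent. A minimal sketch on SQLite via Python's `sqlite3` module, with hypothetical rows in which one order has a NULL `customer_id`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cyd');
    INSERT INTO orders VALUES (10, 1), (11, NULL);
""")

# 2 NOT IN (1, NULL) evaluates to UNKNOWN, so no customer passes the filter.
not_in = conn.execute("""
    SELECT name FROM customers
    WHERE id NOT IN (SELECT customer_id FROM orders)
    ORDER BY name
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists = conn.execute("""
    SELECT name FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY name
""").fetchall()

print(not_in)
print(not_exists)
```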

Table 10: Set Operations

Set operators combine the rows of two or more SELECT queries that return the same number of columns with compatible data types. The result uses the column names from the first query, any ORDER BY must come at the very end of the combined statement, and NULLs are treated as equal for the purposes of duplicate elimination — a deliberate exception to SQL's usual three-valued logic.

Operation	Example	Description
UNION	`SELECT city FROM customers UNION SELECT city FROM suppliers`	• Combines result sets and removes duplicates (same logic as `DISTINCT`) • slower than `UNION ALL` because of the extra deduplication step.
UNION ALL	`SELECT product_id FROM orders_2024 UNION ALL SELECT product_id FROM orders_2025`	• Combines result sets including all duplicates • faster than `UNION` — the canonical guidance is to default to `UNION ALL` and only use `UNION` when you actually need duplicates removed.
INTERSECT	`SELECT customer_id FROM orders_jan INTERSECT SELECT customer_id FROM orders_feb`	• Returns rows that appear in both result sets • result is deduplicated; some dialects (e.g. PostgreSQL) also support `INTERSECT ALL` to keep duplicate matches.
EXCEPT (MINUS)	`SELECT customer_id FROM customers EXCEPT SELECT customer_id FROM orders`	• Returns rows in the first set that do not appear in the second • `EXCEPT` is the ANSI name (PostgreSQL, SQL Server, MySQL 8.0.31+); Oracle uses `MINUS` • result is deduplicated.
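
The four operators, and the rule that NULLs count as equal during duplicate elimination, can be compared on one pair of inputs. The sketch uses SQLite via Python's `sqlite3` module and hypothetical `customers` / `suppliers` rows; results are collected into sets because no `ORDER BY` is applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (city TEXT);
    CREATE TABLE suppliers (city TEXT);
    INSERT INTO customers VALUES ('Oslo'), (NULL), ('Rome');
    INSERT INTO suppliers VALUES ('Oslo'), (NULL), ('Lima');
""")

def run(operator):
    """Combine the two city lists with the given set operator."""
    sql = f"SELECT city FROM customers {operator} SELECT city FROM suppliers"
    return conn.execute(sql).fetchall()

union_all = run("UNION ALL")   # keeps all six input rows
union = run("UNION")           # the two NULLs collapse into one row
intersect = run("INTERSECT")   # NULL matches NULL here, unlike in a WHERE clause
except_ = run("EXCEPT")        # rows only in customers

print(len(union_all), len(union), set(intersect), set(except_))
```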

Table 11: Advanced Grouping

GROUP BY extensions add multi-level subtotals and grand totals in a single pass over the data, replacing stacked UNION ALL queries with concise, optimizer-friendly syntax. ROLLUP gives strict hierarchical subtotals, CUBE gives every combination, and GROUPING SETS lets you cherry-pick exactly the sets you want; GROUPING() and GROUPING_ID() then let you tell super-aggregate rows apart from rows that happen to contain real NULLs.

Extension	Example	Description
ROLLUP	`SELECT year, quarter, SUM(sales)` `FROM revenue` `GROUP BY ROLLUP(year, quarter)`	• Generates hierarchical subtotals — produces N+1 grouping sets for N columns (each prefix of the column list plus the grand total) • e.g., totals by (year, quarter), (year), and grand total in one query.
CUBE	`SELECT dept, year, SUM(revenue)` `FROM sales GROUP BY CUBE(dept, year)`	• Generates every subset of the listed columns — 2^N grouping sets for N columns (the full power set) • includes every cross-tabulation plus the grand total.
GROUPING SETS	`GROUP BY GROUPING SETS` `((dept, year), (dept), (year), ())`	• Explicit control over which grouping combinations to compute — no implicit hierarchy, no full crossing • avoids paying for CUBE combinations you don't need.
GROUPING()	`SELECT GROUPING(dept), dept, SUM(sal)` `FROM employees GROUP BY ROLLUP(dept)`	• Returns 1 for super-aggregate rows (where the column was rolled up to NULL) and 0 for normal rows • the only reliable way to distinguish ROLLUP/CUBE nulls from actual NULL values in the data.
GROUPING_ID()	`SELECT GROUPING_ID(dept, year), dept, year, SUM(sal)` `FROM employees GROUP BY ROLLUP(dept, year)`	• Returns an integer bitmap where each bit is the `GROUPING()` value for that column — identifies the row's grouping level in a single value • cleaner than multiple GROUPING() calls when you need to label, sort, or filter by level.
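
To show what `ROLLUP(year, quarter)` expands to, the sketch below writes out the three grouping sets by hand as the stacked `UNION ALL` the extension replaces. SQLite (used here through Python's `sqlite3`) has no `ROLLUP`, which is why the expansion is explicit; the `is_subtotal` flag is a hypothetical stand-in for what `GROUPING(quarter)` would report, and the `revenue` rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revenue (year INTEGER, quarter INTEGER, sales INTEGER);
    INSERT INTO revenue VALUES (2024, 1, 10), (2024, 2, 20), (2025, 1, 30);
""")

rows = conn.execute("""
    -- Grouping set (year, quarter): the detail level
    SELECT year, quarter, SUM(sales) AS total, 0 AS is_subtotal
    FROM revenue GROUP BY year, quarter
    UNION ALL
    -- Grouping set (year): quarter is rolled up to NULL
    SELECT year, NULL, SUM(sales), 1
    FROM revenue GROUP BY year
    UNION ALL
    -- Grouping set (): the grand total
    SELECT NULL, NULL, SUM(sales), 1
    FROM revenue
""").fetchall()

for row in rows:
    print(row)
```

Two columns produce three grouping sets (N + 1), and each one costs a separate scan in this hand-written form, which is the overhead the single-pass extensions are meant to avoid.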

Table 12: PIVOT / UNPIVOT

PIVOT and UNPIVOT reshape result sets between long (one row per value) and wide (one column per category) layouts, which is the core operation behind most cross-tab reports. They are T-SQL extensions also supported in Oracle and Snowflake (with dialect-specific syntax), but PostgreSQL and MySQL have no native PIVOT — for those engines, and for any portable code, use conditional aggregation with CASE inside aggregate functions.

Operator	Example	Description
PIVOT	`SELECT dept, [2023], [2024], [2025]` `FROM sales_data` `PIVOT (` `SUM(amount) FOR year IN ([2023],[2024],[2025])` `) AS pvt`	• Rotates row values into columns — aggregates a measure and spreads distinct values from one column into separate columns • Aggregate function is always required, even when each cell maps to one row; values not in the `IN` list are silently dropped.
UNPIVOT	`SELECT dept, year, amount` `FROM wide_table` `UNPIVOT (` `amount FOR year IN ([2023],[2024],[2025])` `) AS unpvt`	• Reverses a pivot — turns column headers back into row values to normalise wide tables • In SQL Server, `NULL`-valued input cells produce no output row (Oracle exposes `INCLUDE NULLS` / `EXCLUDE NULLS`; T-SQL defaults to drop).
Dynamic PIVOT (Conditional Aggregation)	`SELECT dept,` `SUM(CASE WHEN year=2024 THEN sales END) AS sales_2024,` `SUM(CASE WHEN year=2025 THEN sales END) AS sales_2025` `FROM sales GROUP BY dept`	• Portable alternative to `PIVOT` using standard `CASE` inside aggregate functions; with plain aliases (rather than T-SQL's bracketed `[2024]`) the same query is accepted by SQL Server, PostgreSQL, MySQL, Oracle, and SQLite • Also the standard workaround when pivoted columns are determined at runtime, paired with dynamic SQL to build the column list.
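
Conditional aggregation can be tried on any engine, including ones with no `PIVOT` keyword. This sketch runs it on SQLite via Python's `sqlite3` module; the `sales` rows and the `sales_2024` / `sales_2025` aliases are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (dept TEXT, year INTEGER, amount INTEGER);
    INSERT INTO sales VALUES
        ('Eng', 2024, 10),
        ('Eng', 2025, 20),
        ('Eng', 2025, 5),
        ('Ops', 2024, 7);
""")

rows = conn.execute("""
    SELECT dept,
           -- CASE yields NULL for other years, and SUM ignores NULLs
           SUM(CASE WHEN year = 2024 THEN amount END) AS sales_2024,
           SUM(CASE WHEN year = 2025 THEN amount END) AS sales_2025
    FROM sales
    GROUP BY dept
    ORDER BY dept
""").fetchall()

for row in rows:
    print(row)
```

Ops has no 2025 rows, so its `sales_2025` cell is NULL rather than 0; add `ELSE 0` inside the `CASE` if a zero is preferred.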

Table 13: Conditional Logic

Conditional expressions let a query branch row-by-row inside SELECT, WHERE, ORDER BY, and UPDATE clauses without leaving SQL. CASE is the portable, ANSI-standard primitive; vendors layer terser shorthands like IIF, CHOOSE, and Oracle's DECODE on top — each with its own NULL-handling quirks worth knowing.

Expression	Example	Description
CASE WHEN ... THEN ... END	`CASE WHEN salary > 100000 THEN 'Senior'` `WHEN salary > 60000 THEN 'Mid'` `ELSE 'Junior'` `END`	• SQL's primary conditional expression • evaluates WHEN conditions top-down and returns the first match — later conditions are skipped • returns `NULL` if no WHEN matches and `ELSE` is omitted.
Simple CASE	`CASE status` `WHEN 'A' THEN 'Active'` `WHEN 'I' THEN 'Inactive'` `ELSE 'Unknown'` `END`	• Equality shorthand — compares one expression to a list of values using `=` • cannot match `NULL` (because `NULL = NULL` is UNKNOWN) — use searched `CASE WHEN ... IS NULL` for that.
IIF()	`IIF(salary > 50000, 'High', 'Low')`	• Ternary shorthand for a two-branch CASE (SQL Server / Access) • rewritten internally as `CASE`; nesting is capped at 10 levels, same as CASE.
CHOOSE()	`CHOOSE(quarter, 'Q1', 'Q2', 'Q3', 'Q4')`	• Returns the Nth item from a value list using a 1-based integer index • SQL Server only; an out-of-range or `NULL` index returns `NULL` (no error).
DECODE()	`DECODE(status, 1, 'Active', 0, 'Inactive', 'Unknown')`	• Oracle-proprietary equality-based CASE substitute: `DECODE(expr, search, result [, ...] [, default])` • uniquely treats two `NULL`s as equal — opposite of standard `CASE`.
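
The NULL limitation of simple CASE is worth seeing once, since `WHEN NULL` parses without error and simply never matches. A minimal sketch on SQLite via Python's `sqlite3` module with a hypothetical `accounts` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO accounts VALUES (1, 'A'), (2, NULL);
""")

rows = conn.execute("""
    SELECT id,
           -- Simple CASE compares with '=', and status = NULL is never true.
           CASE status
               WHEN 'A'  THEN 'Active'
               WHEN NULL THEN 'Missing'
               ELSE 'Unknown'
           END AS simple_case,
           -- Searched CASE can test IS NULL explicitly.
           CASE
               WHEN status = 'A'   THEN 'Active'
               WHEN status IS NULL THEN 'Missing'
               ELSE 'Unknown'
           END AS searched_case
    FROM accounts
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```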

Table 14: String Functions

String functions transform, inspect, and assemble text values directly inside SQL. Their behavior — especially around NULL, trailing spaces, and which dialect supports which syntax — varies meaningfully between SQL Server, PostgreSQL, MySQL, and Oracle, so the same expression can return different results depending on the engine.

Function	Example	Description
CONCAT()	`CONCAT(first_name, ' ', last_name)`	• Joins strings end-to-end • SQL Server and PostgreSQL `CONCAT` treat `NULL` arguments as empty strings, but MySQL's `CONCAT` returns `NULL` if any argument is `NULL` • PostgreSQL/Oracle/SQLite also support the `||` operator: in PostgreSQL and SQLite it returns `NULL` if any operand is `NULL`, while Oracle treats a `NULL` operand as an empty string.
CONCAT_WS()	`CONCAT_WS(', ', city, state, country)`	• Concatenates with a separator — first argument is the separator • ignores `NULL` arguments and does NOT emit the separator for them (so the separator only ever appears between non-NULL values); SQL Server 2017+, PostgreSQL, MySQL.
UPPER() / LOWER()	`UPPER(last_name)`	• Converts a string to uppercase or lowercase according to the database's locale • case folding is locale-sensitive — the Turkish dotted/dotless I is a classic gotcha.
TRIM() / LTRIM() / RTRIM()	`TRIM(LEADING ' ' FROM name)`	• Removes leading, trailing, or both spaces (or specified characters) • standard SQL `TRIM` supports `LEADING` / `TRAILING` / `BOTH` • SQL Server gained `TRIM` in 2017; the `LEADING` / `TRAILING` / `BOTH` keywords require SQL Server 2022 with compatibility level 160.
SUBSTRING() / SUBSTR()	`SUBSTRING(email, 1, CHARINDEX('@', email) - 1)`	• Extracts a portion of a string given a 1-based start position and optional length • positions start at 1 in every major dialect; Oracle's `SUBSTR` also accepts a negative start to count from the end of the string, but SQL Server's `SUBSTRING` does not.
LEFT() / RIGHT()	`LEFT(phone, 3)` `RIGHT(zip_code, 4)`	• Extracts N characters from the left or right end of a string • a convenient shorthand for `SUBSTRING`.
LEN() / LENGTH()	`WHERE LEN(description) > 500`	• Returns the number of characters in a string • SQL Server's `LEN` silently excludes trailing spaces; use `DATALENGTH` if you need the byte length including them • PostgreSQL's `LENGTH` counts characters including trailing spaces, while MySQL's `LENGTH` returns bytes (use `CHAR_LENGTH` for characters).
REPLACE()	`REPLACE(phone, '-', '')`	• Substitutes all occurrences (not just the first) of a substring with another string • in SQL Server the match honors the column's collation, which is case-insensitive by default.
CHARINDEX() / INSTR() / POSITION()	`CHARINDEX('@', email)`	• Returns the 1-based character position of a substring, or `0` if not found (never `NULL` unless an argument is `NULL`) • `CHARINDEX` in SQL Server, `INSTR` in Oracle/MySQL, `POSITION(substr IN str)` in PostgreSQL.
STRING_SPLIT()	`SELECT value FROM STRING_SPLIT('a,b,c', ',')`	• Splits a string on a single-character delimiter and returns a one-column table • SQL Server 2016+ • output order is not guaranteed unless you pass `1` as the optional third `enable_ordinal` argument (SQL Server 2022+) and `ORDER BY ordinal` • PostgreSQL uses `string_to_table` / `regexp_split_to_table`.
LPAD() / RPAD()	`LPAD(CAST(id AS VARCHAR), 8, '0')` `RPAD(name, 20, ' ')`	• Pads a string to a specified length by prepending/appending fill characters; if the input is already longer than the target length, both functions truncate it on the right • PostgreSQL/MySQL/Oracle have these built in; SQL Server has no native `LPAD`, and the usual workaround is `RIGHT('00000000' + CAST(id AS VARCHAR), 8)`.
INITCAP()	`INITCAP('john SMITH')`	• Capitalizes the first letter of each word and lowercases the rest, where "word" means a run of alphanumeric characters separated by non-alphanumerics • PostgreSQL and Oracle only — no direct equivalent in SQL Server.
FORMAT()	`FORMAT(amount, 'C2', 'en-US')`	• Formats a value as a string using .NET format strings with optional culture (SQL Server only) • supports currency, dates, and numbers with locale awareness, but relies on the CLR and is notably slower than `CONVERT` on large row counts.
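
Because NULL and trailing-space handling differ by engine, it helps to probe the engine you are actually on rather than trust memory. The sketch below does that for SQLite through Python's `sqlite3` module; the results describe SQLite only and are not a statement about the other dialects in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

row = conn.execute("""
    SELECT 'Ann' || ' ' || NULL,                 -- || propagates NULL in SQLite
           'Ann' || ' ' || COALESCE(NULL, ''),   -- COALESCE guards the operand
           LENGTH('ab  '),                       -- trailing spaces are counted
           LENGTH(RTRIM('ab  '))                 -- RTRIM removes them first
""").fetchone()

print(row)
```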

Table 15: Date & Time Functions

Date and time handling varies more between dialects than almost any other area of SQL — function names, return types, and time-zone semantics all differ. The functions below cover the common needs: getting the current moment, doing arithmetic on dates, extracting parts, truncating to a unit, and formatting for display. Watch for two traps that bite even experienced developers: DATEDIFF counts boundary crossings rather than elapsed time, and NOW() in PostgreSQL returns the transaction start time, not the actual clock.

Function	Example	Description
GETDATE() / NOW() / CURRENT_TIMESTAMP	`SELECT GETDATE()`	• Returns the current date and time • `GETDATE()` in SQL Server returns server local time; use `GETUTCDATE()` or `SYSUTCDATETIME()` for UTC • `NOW()` works in PostgreSQL and MySQL; `CURRENT_TIMESTAMP` is the ANSI form and the most portable • in PostgreSQL both return the transaction start time; `clock_timestamp()` returns the actual wall-clock time.
DATEADD() / DATE_ADD() / INTERVAL	`DATEADD(month, 3, hire_date)` `hire_date + INTERVAL '3 months'`	• Adds a time interval to a date; pass a negative number to subtract • `DATEADD` in SQL Server, `DATE_ADD` in MySQL, `+ INTERVAL` in PostgreSQL/Oracle.
DATEDIFF()	`DATEDIFF(day, start_date, end_date)`	• Returns the number of datepart boundaries crossed between two dates, not strictly elapsed time • `DATEDIFF(year, '2024-12-31', '2025-01-01')` returns `1` even though the dates are 1 day apart • the three-argument form is SQL Server; MySQL's `DATEDIFF(end_date, start_date)` takes two arguments and returns days only; PostgreSQL uses `EXTRACT` or subtraction.
DATEPART() / EXTRACT()	`DATEPART(year, hire_date)` `EXTRACT(YEAR FROM hire_date)`	• Returns a specific date component (year, month, day, hour, etc.) as a number • `DATEPART` is T-SQL; `EXTRACT` is the ANSI-standard form supported by PostgreSQL, MySQL, and Oracle • SQL Server has no `EXTRACT`, so use `DATEPART` there.
DATE_TRUNC()	`DATE_TRUNC('month', created_at)`	• Truncates a timestamp to a specified precision (year, quarter, month, day, hour) — ideal for grouping by month or quarter • PostgreSQL / Snowflake / Redshift; SQL Server 2022+ provides `DATETRUNC()`.
CAST() (to date/time)	`CAST('2025-01-15' AS DATE)`	• Converts a string to a date/datetime value (ANSI standard, works across dialects) • In SQL Server, ambiguous formats like `'01/02/2025'` depend on session `SET DATEFORMAT` / language; ISO 8601 (`YYYY-MM-DD`) is always safe.
TO_CHAR() / FORMAT()	`TO_CHAR(order_date, 'YYYY-MM')` `FORMAT(order_date, 'yyyy-MM')`	• Formats a date as a string with a template pattern, useful for display or grouping by year-month • `TO_CHAR` in PostgreSQL/Oracle/Snowflake; `FORMAT` in SQL Server 2012+ • MySQL's `FORMAT` is for numbers only; use `DATE_FORMAT(order_date, '%Y-%m')` for dates.
EOMONTH()	`EOMONTH(GETDATE())`	• Returns the last day of the month for a given date (as a `date`, with the time component dropped) • Optional second argument adds/subtracts months before computing month-end • SQL Server 2012+; MySQL and Oracle use `LAST_DAY()` instead.
CONVERT() (date format)	`CONVERT(VARCHAR, GETDATE(), 103)`	• Formats a date as a string using a numeric style code • Style `103` = `dd/mm/yyyy`, `120` = `yyyy-mm-dd hh:mi:ss`, `126` = ISO 8601 • SQL Server only; other dialects use `TO_CHAR` or `FORMAT`.
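
The boundary-crossing behaviour of `DATEDIFF` can be imitated in any engine by subtracting the extracted parts. The sketch below assumes SQLite via Python's `sqlite3` module (SQLite has no `DATEDIFF`; `strftime` and `julianday` are its own functions) and reuses the two dates from the DATEDIFF row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
params = {"start_date": "2024-12-31", "end_date": "2025-01-01"}

year_boundaries, elapsed_days = conn.execute(
    """
    SELECT
        -- year boundaries crossed: what DATEDIFF(year, ...) reports
        CAST(strftime('%Y', :end_date) AS INTEGER)
          - CAST(strftime('%Y', :start_date) AS INTEGER),
        -- actual elapsed time, in days
        julianday(:end_date) - julianday(:start_date)
    """,
    params,
).fetchone()

print(year_boundaries, elapsed_days)
```

Both numbers come out as one, but they mean different things: one year boundary was crossed, while only one day elapsed. For age or tenure calculations the elapsed form is usually the one you want.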

Table 16: Numeric Functions

Numeric functions handle the arithmetic, rounding, and randomness work that lives outside of plain + - * /. Dialect quirks bite hard here — what LOG(100) returns, whether ROUND rounds halves away from zero or to-even, and whether RAND() re-runs per row all change between vendors.

Function	Example	Description
ROUND()	`ROUND(123.45, -1)` → `120`	• Rounds to N decimal places • negative N rounds to tens, hundreds, etc. • SQL Server and PostgreSQL `numeric` break ties away from zero.
FLOOR() / CEILING()	`FLOOR(-4.7)` → `-5` `CEILING(4.2)` → `5`	• FLOOR returns the largest integer ≤ value (rounds toward −∞) • CEILING returns the smallest integer ≥ value (rounds toward +∞).
ABS()	`ABS(balance - target)`	• Returns the absolute value (strips the sign) • Can overflow on the minimum signed integer (e.g. `ABS(-2147483648)` in SQL Server).
POWER() / SQRT()	`POWER(2, 10)` → `1024` `SQRT(144)` → `12`	• POWER(x, y) raises x to the y-th power • SQRT returns the square root; negative input raises a domain error in most engines.
MOD() / %	`SELECT 17 % 5` → `2` `MOD(17, 5)` → `2`	• Returns the remainder after division • `%` in SQL Server/PostgreSQL, `MOD()` in Oracle/MySQL/PostgreSQL • In Postgres the result inherits the sign of the dividend.
LOG() / LOG10() / LN()	`LOG(EXP(1))` → `1` (SQL Server) `LOG(100)` → `2` (PostgreSQL)	• SQL Server / MySQL: `LOG(x)` = natural log; `LOG(x, base)` for arbitrary base in SQL Server • PostgreSQL: `LOG(x)` = base-10; use `LN(x)` for natural log • `LOG10` is always base-10.
SIGN()	`SIGN(balance)` → `-1`, `0`, or `1`	• Returns −1, 0, or +1 for negative, zero, or positive input • useful for directional logic without `CASE`.
RAND() / RANDOM()	`SELECT RAND()` (SQL Server) `SELECT RANDOM()` (PostgreSQL)	• Returns a pseudo-random float between 0 and 1 • `RAND` in SQL Server/MySQL, `RANDOM()` in PostgreSQL • In SQL Server, `RAND()` evaluates once per query — use `NEWID()` for per-row randomness.
TRUNC() / TRUNCATE()	`TRUNC(3.987, 2)` → `3.98` `TRUNC(-4.7)` → `-4`	• Chops (does not round) to N decimal places • Oracle/PostgreSQL use `TRUNC`, MySQL uses `TRUNCATE` • Differs from `FLOOR` for negatives (`TRUNC` rounds toward zero, `FLOOR` toward −∞).
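
To make the negative-number cases concrete, the following sketch compares truncation toward zero with flooring, and shows the sign of a remainder. It assumes SQLite via Python's `sqlite3` module, where `CAST(... AS INTEGER)` truncates toward zero like `TRUNC` and `%` keeps the sign of the dividend; Python's own `math.floor` stands in for SQL `FLOOR` because SQLite's math functions are an optional build feature.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

truncated, sql_remainder = conn.execute(
    "SELECT CAST(-4.7 AS INTEGER), -17 % 5"   # truncation toward zero; remainder of -17 / 5
).fetchone()

floored = math.floor(-4.7)    # rounds toward negative infinity, like SQL FLOOR
python_remainder = -17 % 5    # Python takes the sign of the divisor instead

print(truncated, floored, sql_remainder, python_remainder)
```

The two remainders disagree (`-2` from SQLite, `3` from Python), which is a reminder to check the sign convention before relying on `%` or `MOD` for bucketing negative values.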

Table 17: NULL Handling

SQL treats NULL as "unknown" under three-valued logic, so most operators that touch a NULL return UNKNOWN rather than TRUE or FALSE — which is why WHERE col = NULL never matches anything and why NOT IN against a subquery containing a NULL silently returns zero rows. The functions and predicates below are the safe, portable tools for testing, substituting, and aggregating NULL data across every major dialect.

Technique	Example	Description
IS NULL / IS NOT NULL	`WHERE phone IS NULL` `WHERE email IS NOT NULL`	• Tests for NULL values • the only safe way — `= NULL` and `!= NULL` always evaluate to UNKNOWN.
COALESCE()	`COALESCE(mobile, home_phone, 'N/A')`	• Returns the first non-NULL argument (NULL only if all are NULL) • ANSI standard • accepts any number of arguments • short-circuits left-to-right.
NULLIF()	`NULLIF(denominator, 0)`	• Returns NULL if both arguments are equal, otherwise the first argument • classic use prevents division-by-zero: `value / NULLIF(denom, 0)`.
ISNULL() / IFNULL() / NVL()	`ISNULL(discount, 0)`	• Two-argument NULL substitution (vendor-specific) • SQL Server `ISNULL`, MySQL `IFNULL`, Oracle `NVL` • less portable than COALESCE.
NULL in aggregates	`AVG(score)` (NULL ignored) `COUNT(*)` (NULL included)	• Most aggregates skip NULLs in both numerator and denominator • `COUNT(*)` counts all rows; `COUNT(col)` skips NULLs.
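NULL in JOINs	`a LEFT JOIN b ON a.id = b.id` `WHERE b.id IS NULL -- anti-join`	• NULL keys never match in join conditions • testing the right-side key with `IS NULL` after a LEFT JOIN keeps only unmatched rows (the anti-join shown here) • any other `WHERE` filter on a right-side column discards the NULL-extended rows and quietly turns the LEFT JOIN into an INNER JOIN; put such filters in the `ON` clause instead • `NOT IN (subquery)` returns zero rows if the subquery yields any NULL, so use `NOT EXISTS` instead.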
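
The three-valued-logic traps above are easy to reproduce. This is a minimal sketch, assuming SQLite via Python's `sqlite3` module and two hypothetical tables (`customers`, `blocked`); the behaviour shown is standard SQL and should carry over to the dialects in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, phone TEXT);
    CREATE TABLE blocked (customer_id INTEGER);
    INSERT INTO customers VALUES (1, '555-0100'), (2, NULL), (3, '555-0199');
    INSERT INTO blocked VALUES (1), (NULL);   -- one NULL is enough to break NOT IN
    """
)

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

eq_null = ids("SELECT id FROM customers WHERE phone = NULL")    # UNKNOWN for every row
is_null = ids("SELECT id FROM customers WHERE phone IS NULL")
not_in = ids(
    "SELECT id FROM customers WHERE id NOT IN (SELECT customer_id FROM blocked)"
)
not_exists = ids(
    "SELECT id FROM customers c WHERE NOT EXISTS "
    "(SELECT 1 FROM blocked b WHERE b.customer_id = c.id) ORDER BY id"
)
count_all, count_phone = conn.execute(
    "SELECT COUNT(*), COUNT(phone) FROM customers"
).fetchone()

print(eq_null, is_null, not_in, not_exists, count_all, count_phone)
```

`NOT IN` returns nothing even though customers 2 and 3 are clearly not blocked, while `NOT EXISTS` returns both, which is why it is the safer default for anti-joins against nullable columns.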

Table 18: Pattern Matching

SQL provides several pattern-matching predicates for filtering rows by text shape — from the simple wildcard-based LIKE to dialect-specific regex operators and full-text search. Which one you reach for depends on how complex the pattern is and which engine you're on.

Technique	Example	Description
LIKE	`WHERE name LIKE 'A%'` `WHERE code LIKE '_-[0-9][0-9]'`	• `%` matches any sequence of characters; `_` matches exactly one character • SQL Server also supports character classes `[a-z]`.
ILIKE	`WHERE name ILIKE '%smith%'`	• Case-insensitive LIKE • PostgreSQL (also available in Snowflake); not part of standard SQL • roughly equivalent to `LOWER(name) LIKE LOWER(pattern)`.
NOT LIKE	`WHERE email NOT LIKE '%@example.com'`	Inverts pattern match — returns rows where the value does not match the pattern.
REGEXP_LIKE / SIMILAR TO / ~ (tilde)	`REGEXP_LIKE(phone, '^\d{3}-\d{4}$')` (Oracle/MySQL) `phone ~ '^\d{3}-\d{4}$'` (PostgreSQL)	• Full regular expression matching • REGEXP_LIKE in Oracle/MySQL, `~` in PostgreSQL, SIMILAR TO is SQL standard but limited; REGEXP_MATCH in PostgreSQL extracts matches.
CONTAINS	`WHERE CONTAINS(description, 'fast AND reliable')`	• Full-text search predicate (SQL Server) • requires a full-text index; supports boolean operators and proximity searches.
ESCAPE clause	`WHERE path LIKE '100\%' ESCAPE '\'`	Treats special LIKE metacharacters (`%`, `_`) as literals when prefixed with the escape character.
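
The effect of the `ESCAPE` clause is easiest to see side by side with the unescaped pattern. The sketch below assumes SQLite via Python's `sqlite3` module and a hypothetical `discounts` table; the pattern is passed as a bound parameter so the backslash needs no extra quoting inside the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE discounts (label TEXT)")
conn.executemany(
    "INSERT INTO discounts VALUES (?)",
    [("100%",), ("1000",), ("100 off",)],
)

# Unescaped: % is a wildcard, so anything starting with '100' matches.
unescaped = [row[0] for row in conn.execute(
    "SELECT label FROM discounts WHERE label LIKE ? ORDER BY label",
    ("100%",),
)]

# Escaped: the backslash makes % a literal percent sign.
escaped = [row[0] for row in conn.execute(
    "SELECT label FROM discounts WHERE label LIKE ? ESCAPE '\\' ORDER BY label",
    ("100\\%",),
)]

print(unescaped, escaped)
```

Any single character can serve as the escape character; a backslash is common, but picking one that cannot appear in the data (such as `!`) avoids double-escaping in host-language strings.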

Table 19: Data Type Conversion

Explicit conversion functions turn one data type into another and decide what happens when the input doesn't fit. The big split is error vs NULL on failure: CAST / CONVERT / PARSE raise errors, while TRY_CAST / TRY_CONVERT / TRY_PARSE return NULL — much safer for dirty user input. Choosing the right one (and pairing column types in WHERE filters) also has real performance impact, because implicit conversions can stop SQL Server from using an index.

Function	Example	Description
CAST()	`CAST('3.14' AS DECIMAL(10,2))`	• ANSI-standard explicit type conversion — works in T-SQL, PostgreSQL, MySQL, Oracle, Snowflake, DuckDB • raises an error if the conversion fails.
CONVERT()	`CONVERT(VARCHAR, GETDATE(), 120)`	• SQL Server type conversion with an optional `style` argument for date/number formatting (e.g., 120 = ODBC canonical `yyyy-mm-dd hh:mi:ss`) • MySQL `CONVERT` has no style code.
TRY_CAST()	`TRY_CAST(user_input AS INT)`	• Safe CAST (SQL Server 2012+) — returns NULL instead of an error when the conversion fails • ideal for validating user input or dirty data.
TRY_CONVERT()	`TRY_CONVERT(DATE, date_string, 101)`	• Safe CONVERT (SQL Server 2012+) — returns NULL on conversion failure • supports the same style codes as `CONVERT`.
TRY_PARSE()	`TRY_PARSE('€1.234,56' AS MONEY USING 'de-DE')`	• Culture-aware string parsing to date/time or number types — returns NULL on failure • relies on the .NET CLR, so slower than `TRY_CAST`.
TO_NUMBER() / TO_DATE() / TO_TIMESTAMP()	`TO_DATE('05 Dec 2000', 'DD Mon YYYY')` `TO_NUMBER('12,454.8', '99G999D9')`	• PostgreSQL and Oracle format-model conversions • the explicit format string controls how the input is parsed.
PARSE()	`PARSE('15 Janvier 2025' AS DATE USING 'fr-FR')`	• SQL Server culture-aware string-to-date/number parse; raises an error on failure (use `TRY_PARSE` for the safe variant) • depends on the .NET CLR.
STR()	`STR(123.45, 6, 1)` returns `' 123.5'`	• SQL Server float-to-character conversion with fixed length and decimal places • right-justifies with leading spaces and returns asterisks (`**`) when the length is too small.
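
The error-versus-NULL split can be sketched outside SQL Server as well. The example below is an emulation, not the real `TRY_CAST`: it assumes SQLite via Python's `sqlite3` module and registers a hypothetical user-defined function `try_cast_int` that returns NULL when conversion fails, then uses it to locate dirty rows in a made-up `raw_input` table.

```python
import sqlite3

def try_cast_int(value):
    """Hypothetical stand-in for TRY_CAST(value AS INT): NULL instead of an error."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

conn = sqlite3.connect(":memory:")
conn.create_function("try_cast_int", 1, try_cast_int)   # expose the helper to SQL
conn.execute("CREATE TABLE raw_input (val TEXT)")
conn.executemany(
    "INSERT INTO raw_input VALUES (?)",
    [("42",), ("abc",), (None,), (" 7 ",)],
)

converted = [row[0] for row in conn.execute(
    "SELECT try_cast_int(val) FROM raw_input ORDER BY rowid"
)]

# Rows that had a value but could not be converted: the usual data-quality check.
bad_rows = [row[0] for row in conn.execute(
    "SELECT val FROM raw_input WHERE val IS NOT NULL AND try_cast_int(val) IS NULL"
)]

print(converted, bad_rows)
```

The `val IS NOT NULL AND ... IS NULL` filter is the part that transfers directly: with a real `TRY_CAST` or `TRY_CONVERT` it separates genuinely missing values from values that failed to parse.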

Table 20: Constraints & Keys

Constraints enforce data integrity at the column or table level, rejecting writes that would violate the rules you declare in CREATE TABLE or ALTER TABLE. Watch the NULL handling closely — UNIQUE, CHECK, and FOREIGN KEY each treat NULL differently, and the defaults differ across SQL Server, PostgreSQL, MySQL, and Oracle.

Constraint	Example	Description
PRIMARY KEY	`id INT PRIMARY KEY`	• Uniquely identifies each row — enforces NOT NULL + UNIQUE together • one per table; may span multiple columns (composite PK) • creates a unique clustered index by default in SQL Server (override with `PRIMARY KEY NONCLUSTERED`); PostgreSQL has no clustered-index concept.
FOREIGN KEY	`FOREIGN KEY (dept_id) REFERENCES departments(id)`	Enforces referential integrity — the value must match a row in the referenced table or be NULL (FK columns are nullable unless you also add `NOT NULL`).
NOT NULL	`name VARCHAR(100) NOT NULL`	• Rejects NULL values on INSERT and UPDATE • independent from `DEFAULT` — supplying an explicit NULL still raises an error even if a default exists.
UNIQUE	`email VARCHAR(255) UNIQUE`	• Ensures all non-NULL values are distinct and creates a backing unique index • PostgreSQL / MySQL / Oracle allow multiple NULLs (NULL ≠ NULL); SQL Server's basic UNIQUE allows only one NULL — use a filtered unique index (`WHERE col IS NOT NULL`) for multi-NULL uniqueness.
CHECK	`salary DECIMAL CHECK (salary >= 0)`	• Validates a Boolean expression on INSERT and UPDATE • NULL is treated as UNKNOWN, not FALSE — a CHECK constraint passes when the expression is TRUE or NULL, so NULLs slip through unless you also add `NOT NULL`.
DEFAULT	`created_at DATETIME DEFAULT GETDATE()`	• Supplies a value when the column is omitted from INSERT or you write the `DEFAULT` keyword • an explicit NULL is NOT replaced by the default — it goes in as NULL (or errors if the column is NOT NULL).
ON DELETE CASCADE	`FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE`	• Automatically deletes child rows when the referenced parent row is deleted • alternatives: `SET NULL` nullifies the child FK column, `SET DEFAULT` uses the child's default, `NO ACTION` / `RESTRICT` block the parent delete.
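
Several of the NULL-related rules above can be checked in a few lines. This is a sketch under assumptions: it uses SQLite via Python's `sqlite3` module (which, like PostgreSQL, allows multiple NULLs under `UNIQUE` and only enforces foreign keys after `PRAGMA foreign_keys = ON`) and hypothetical `orders` / `order_items` tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite-specific: FKs are off by default
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE order_items (
        id       INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(id) ON DELETE CASCADE,
        sku      TEXT UNIQUE,
        qty      INTEGER CHECK (qty > 0)
    );
    INSERT INTO orders VALUES (1);
    -- NULL sku twice (UNIQUE permits it here) and NULL qty (CHECK sees UNKNOWN, so it passes)
    INSERT INTO order_items (order_id, sku, qty) VALUES (1, NULL, NULL);
    INSERT INTO order_items (order_id, sku, qty) VALUES (1, NULL, 2);
    """
)

try:
    conn.execute("INSERT INTO order_items (order_id, sku, qty) VALUES (1, 'A', 0)")
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True                  # qty = 0 makes the CHECK expression FALSE

items_before = conn.execute("SELECT COUNT(*) FROM order_items").fetchone()[0]
conn.execute("DELETE FROM orders WHERE id = 1")    # cascades to the child rows
items_after = conn.execute("SELECT COUNT(*) FROM order_items").fetchone()[0]

print(check_rejected, items_before, items_after)
```

Only the row with `qty = 0` is rejected; the NULL quantity slips through, which is the case an added `NOT NULL` would close.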

Table 21: Indexes

Indexes are the single biggest performance lever in SQL: they turn full table scans into fast lookups by keeping a sorted, pointer-based structure (usually a B-tree) over chosen columns. They are pure redundancy though — every INSERT, UPDATE, or DELETE that touches an indexed column also rewrites the index — so over-indexing is just as damaging as missing indexes, and column order in multi-column indexes is rarely a free choice.

Concept	Example	Description
CREATE INDEX	`CREATE INDEX idx_last_name ON customers (last_name)`	Creates a non-unique index (B-tree by default) to speed up queries that filter, sort, or join on the column(s). Indexes are redundant copies — they cost storage and slow writes.
CREATE UNIQUE INDEX	`CREATE UNIQUE INDEX idx_email ON users (email)`	• Enforces uniqueness on the indexed column(s) while accelerating lookups • functionally equivalent to a UNIQUE constraint (which is implemented as a unique index behind the scenes).
Clustered Index	`CREATE CLUSTERED INDEX idx_order_id ON orders (order_id)`	• Physically orders the table rows by the index key — the table data IS the clustered index's leaf level • only one per table; in SQL Server, PRIMARY KEY creates a clustered index by default. PostgreSQL has no clustered-index concept (only a one-time `CLUSTER` reorder command).
Non-clustered Index	`CREATE NONCLUSTERED INDEX idx_dept ON employees (department_id)`	• Maintains a separate structure with key values plus row locators back to the heap or clustered index • multiple allowed per table; fast for selective lookups but each non-covered query needs a key lookup.
Composite Index	`CREATE INDEX idx_name_dept ON employees (last_name, department_id)`	• Index on multiple columns in a defined order — usable only when the query filters on a leftmost prefix of those columns • an index on `(A, B)` helps `WHERE A=…` and `WHERE A=… AND B=…`, but not `WHERE B=…` alone.
Covering Index (INCLUDE)	`CREATE INDEX idx_cust ON orders (customer_id)` `INCLUDE (amount, order_date)`	• Adds non-key columns to the index leaf via `INCLUDE` (SQL Server, PostgreSQL 11+) • query is satisfied entirely from the index with no key lookup back to the table — a big read-path win for hot queries.
Filtered Index (Partial Index)	`CREATE INDEX idx_active ON users (email)` `WHERE active = 1`	• Indexes only the rows matching a `WHERE` predicate ("filtered index" in SQL Server, "partial index" in PostgreSQL) • smaller, cheaper to maintain, and useful when queries usually target a known subset.
DROP INDEX	`DROP INDEX idx_last_name ON customers`	• Removes an index and frees its storage • syntax varies: SQL Server and MySQL use `DROP INDEX name ON table` (MySQL also supports `ALTER TABLE … DROP INDEX`); PostgreSQL uses `DROP INDEX name`. Cannot drop indexes backing a PRIMARY KEY or UNIQUE constraint; drop the constraint instead.
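
The leftmost-prefix rule for composite indexes can be observed by asking the planner what it intends to do. The sketch below assumes SQLite via Python's `sqlite3` module and its `EXPLAIN QUERY PLAN` statement; other engines expose the same information through their own `EXPLAIN` variants, and the exact plan wording differs by engine and version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY, last_name TEXT, department_id INTEGER, salary REAL
    );
    CREATE INDEX idx_name_dept ON employees (last_name, department_id);
    """
)

def plan(where_clause):
    """Return the planner's description for a SELECT * with the given WHERE clause."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM employees WHERE {where_clause}"
    )
    return " | ".join(row[3] for row in rows)   # the fourth column holds the plan text

leading = plan("last_name = 'Smith'")
both = plan("last_name = 'Smith' AND department_id = 7")
trailing_only = plan("department_id = 7")

for label, text in [("A", leading), ("A and B", both), ("B only", trailing_only)]:
    print(f"{label}: {text}")
```

The first two plans mention `idx_name_dept`; the third falls back to scanning the table because `department_id` is not a leftmost prefix of the index.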

Table 22: Views

Views package a SELECT statement under a name so it can be queried like a table. Standard views store no data — the underlying query re-runs on every reference, which is cheap on storage but offers no speed-up over the raw query. Materialized views flip that trade-off by persisting the result on disk for fast reads at the cost of staleness until the next refresh.

Technique	Example	Description
CREATE VIEW	`CREATE VIEW active_customers AS` `SELECT * FROM customers WHERE active = 1`	• Creates a named, saved query that can be queried like a table • not physically materialized — the defining query is re-run on every reference, so storage is zero but there is no performance gain over the underlying SELECT.
CREATE OR REPLACE VIEW	`CREATE OR REPLACE VIEW monthly_sales AS` `SELECT EXTRACT(MONTH FROM order_date) AS month, SUM(amount) FROM orders GROUP BY 1`	• Atomically replaces an existing view's defining query without dropping it • PostgreSQL, MySQL, Oracle • PostgreSQL requires the new query to produce the same column names, order, and types, and extra columns may only be appended at the end • SQL Server uses `ALTER VIEW` or `CREATE OR ALTER VIEW` (2016 SP1+) instead.
WITH CHECK OPTION	`CREATE VIEW eng_staff AS` `SELECT * FROM employees WHERE dept = 'Engineering'` `WITH CHECK OPTION`	• Rejects INSERT, UPDATE, or MERGE through the view if the resulting row would no longer satisfy the view's WHERE clause • `LOCAL` checks only the current view; `CASCADED` (the default if neither is specified) also checks every underlying view.
Materialized View	`CREATE MATERIALIZED VIEW report_summary AS` `SELECT dept, SUM(salary) FROM employees GROUP BY dept;` `REFRESH MATERIALIZED VIEW CONCURRENTLY report_summary;`	• Stores query results physically — SELECTs are fast but data is stale until the next `REFRESH MATERIALIZED VIEW` • a plain refresh takes an exclusive lock; `CONCURRENTLY` (PG 9.4+) avoids blocking readers but requires a UNIQUE index on the view • PostgreSQL and Oracle; SQL Server's equivalent is an indexed view (auto-maintained on every base-table change, no manual refresh).
DROP VIEW	`DROP VIEW IF EXISTS active_customers`	• Removes only the view definition — underlying tables and their rows are unaffected • `RESTRICT` (the default) blocks the drop if other objects depend on the view; `CASCADE` drops them too.
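
Because a standard view stores no data, changes to the base table show up on the next read of the view, and dropping the view leaves the table alone. A minimal sketch, assuming SQLite via Python's `sqlite3` module and a hypothetical `customers` table mirroring the `active_customers` example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, active INTEGER);
    INSERT INTO customers VALUES (1, 'Ada', 1), (2, 'Bob', 0);
    CREATE VIEW active_customers AS
        SELECT id, name FROM customers WHERE active = 1;
    """
)

def active_names():
    """Query the view; its defining SELECT is re-evaluated on every call."""
    return [row[0] for row in conn.execute(
        "SELECT name FROM active_customers ORDER BY id"
    )]

before = active_names()
conn.execute("UPDATE customers SET active = 1 WHERE id = 2")   # change the base table only
after = active_names()

conn.execute("DROP VIEW IF EXISTS active_customers")           # removes the definition only
remaining_rows = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

print(before, after, remaining_rows)
```

A materialized view would have kept returning the old result until refreshed, which is the staleness trade-off described above.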

Table 23: Transactions

Transactions group one or more SQL statements into an atomic unit so the database can guarantee ACID properties (Atomicity, Consistency, Isolation, Durability). The statements below mark transaction boundaries, set the visibility rules between concurrent sessions, and (in the case of WITH (NOLOCK)) trade correctness for raw read speed in SQL Server.

Statement	Example	Description
BEGIN TRANSACTION	`BEGIN TRANSACTION`	• Starts an explicit transaction block — subsequent changes are held until COMMIT or ROLLBACK • `BEGIN` / `START TRANSACTION` in PostgreSQL and MySQL; most engines default to autocommit (each statement is its own transaction) unless wrapped.
COMMIT	`COMMIT`	• Permanently saves all changes made since the transaction started and frees the transaction's locks and resources • cannot be undone — once committed, ROLLBACK no longer applies.
ROLLBACK	`ROLLBACK`	• Discards all changes made since BEGIN (or since a named SAVEPOINT, with `ROLLBACK TO`) • used for error recovery and to abort transactions cleanly.
SAVEPOINT	`SAVEPOINT before_update` `ROLLBACK TO before_update`	• Creates a named checkpoint inside a transaction • `ROLLBACK TO savepoint` undoes work back to that point without ending the transaction — the outer transaction stays open and can still COMMIT.
SET TRANSACTION ISOLATION LEVEL	`SET TRANSACTION ISOLATION LEVEL READ COMMITTED`	• Controls which concurrency anomalies are allowed — dirty reads, non-repeatable reads, phantom reads • Levels (ascending strictness): READ UNCOMMITTED → READ COMMITTED → REPEATABLE READ → SERIALIZABLE; SQL Server adds SNAPSHOT. Defaults vary: READ COMMITTED in SQL Server / PostgreSQL / Oracle, REPEATABLE READ in MySQL InnoDB.
WITH (NOLOCK)	`SELECT * FROM orders WITH (NOLOCK)`	• SQL Server table hint equivalent to READ UNCOMMITTED for that table — reads without acquiring shared locks • can return uncommitted (dirty) data, duplicate rows, or miss previously-committed rows; a notorious anti-pattern when used as a "go-fast" switch.
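
The interplay of `SAVEPOINT`, `ROLLBACK TO`, and `COMMIT` can be traced with a single row. This sketch assumes SQLite via Python's `sqlite3` module, opened with `isolation_level=None` so that the module does not start transactions on its own and every boundary is written out explicitly; the `accounts` table is hypothetical.

```python
import sqlite3

# isolation_level=None: the driver issues no implicit BEGIN, so we control boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")

conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")    # kept
conn.execute("SAVEPOINT before_bonus")
conn.execute("UPDATE accounts SET balance = balance + 1000 WHERE id = 1")  # undone below
conn.execute("ROLLBACK TO before_bonus")
still_open = conn.in_transaction     # the outer transaction survives ROLLBACK TO
conn.execute("COMMIT")

balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(still_open, balance)
```

Only the work after the savepoint is discarded: the withdrawal of 30 is committed and the bonus is not.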

Table 24: Error Handling

SQL Server's structured error handling centers on TRY...CATCH: any runtime error of severity above 10 inside the TRY block jumps execution to the matching CATCH block, where a family of ERROR_* functions exposes the message, number, severity, line, and procedure of the original error. Inside CATCH you also check XACT_STATE() to decide whether to commit or roll back, and re-surface errors with THROW (preferred since SQL Server 2012) or the older RAISERROR.

Construct	Example	Description
TRY...CATCH	`BEGIN TRY` `INSERT INTO orders VALUES (1, 99.99)` `END TRY` `BEGIN CATCH` `SELECT ERROR_MESSAGE(), ERROR_NUMBER()` `END CATCH`	• Structured error handling (SQL Server / Azure SQL) • any runtime error of severity > 10 in TRY transfers control to CATCH; compile and name-resolution errors at the same scope are NOT caught.
ERROR_MESSAGE()	`SELECT ERROR_MESSAGE()`	• Returns the error message text of the error that triggered the CATCH block • returns NULL outside CATCH.
ERROR_NUMBER()	`SELECT ERROR_NUMBER()`	• Returns the error number — correlates to `sys.messages` for system errors • user-defined errors use 50000+ • unlike `@@ERROR`, the value persists across the whole CATCH scope.
XACT_STATE()	`IF XACT_STATE() = -1 ROLLBACK` `IF XACT_STATE() = 1 COMMIT`	• Returns 1 (active committable), -1 (active uncommittable — must rollback), or 0 (no active transaction) • essential check in CATCH blocks before deciding to COMMIT or ROLLBACK.
THROW	`THROW 50001, 'Invalid customer ID', 1` `-- or to re-throw:` `THROW`	• Raises a user-defined error (number ≥ 50000, severity always 16) or re-throws the original error with its original details (bare `THROW` in CATCH) • SQL Server 2012+; preferred over `RAISERROR` for new code; honors `SET XACT_ABORT`.
RAISERROR()	`RAISERROR('Custom error: %s', 16, 1, @detail)`	• Older mechanism for raising errors; supports custom severity 0-25 (19 and above require sysadmin rights and `WITH LOG`) and printf-style message formatting (`%s`, `%d`) • does not honor `SET XACT_ABORT`; still useful when you need a custom severity.
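
The same catch, inspect, then roll back shape applies when the error handling lives in client code rather than T-SQL. The sketch below is only an analogy, assuming SQLite via Python's `sqlite3` module: the exception text plays the role of `ERROR_MESSAGE()` and `conn.in_transaction` plays the role of the `XACT_STATE()` check. It does not reproduce SQL Server's severity levels or uncommittable-transaction state.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)   # explicit transaction control
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")

error_message = None
try:
    conn.execute("BEGIN")
    conn.execute("INSERT INTO orders VALUES (1, 99.99)")
    conn.execute("INSERT INTO orders VALUES (1, 10.00)")   # duplicate key raises
    conn.execute("COMMIT")
except sqlite3.IntegrityError as exc:
    error_message = str(exc)          # analogous to ERROR_MESSAGE()
    if conn.in_transaction:           # analogous to checking XACT_STATE() <> 0
        conn.execute("ROLLBACK")

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(error_message, order_count)
```

Checking the transaction state before rolling back matters in both worlds: issuing `ROLLBACK` with no open transaction is itself an error.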

Table 25: JSON Functions

SQL Server stores JSON as plain nvarchar (or the native json type in 2025+) and exposes a family of functions to extract, validate, build, and shred it. Each function pairs with a specific shape — scalars use JSON_VALUE, objects and arrays use JSON_QUERY, and rowset shredding uses OPENJSON — and most accept a lax (silent NULL) or strict (raise error) path mode.

Function	Example	Description
JSON_VALUE()	`JSON_VALUE(payload, '$.customer.name')`	• Extracts a scalar value (string, number, boolean) from a JSON path • returns NULL in lax mode (default) if the path is missing or points to an object/array.
JSON_QUERY()	`JSON_QUERY(payload, '$.address')`	• Extracts a JSON object or array fragment • returns NULL for scalar values in lax mode (the opposite of JSON_VALUE).
JSON_OBJECT()	`JSON_OBJECT('id':id, 'name':name)`	• Constructs a JSON object from key:value pairs (SQL Server 2022+) • default NULL ON NULL keeps keys whose value is NULL (emitted as JSON `null`); ABSENT ON NULL omits them.
JSON_ARRAY()	`JSON_ARRAY(1, 'two', NULL)`	• Constructs a JSON array from values (SQL Server 2022+) • default ABSENT ON NULL silently drops NULL elements (opposite of JSON_OBJECT default).
ISJSON()	`WHERE ISJSON(payload) = 1`	• Returns 1 if the string is valid JSON, 0 if not, NULL if input is NULL • 2022+ accepts a type constraint: VALUE, ARRAY, OBJECT, SCALAR.
JSON_MODIFY()	`JSON_MODIFY(payload, '$.status', 'active')`	• Returns a new JSON string with one property updated • setting NULL in lax mode deletes the key • use the `append` keyword to push onto an array.
OPENJSON()	`SELECT * FROM OPENJSON(@json)` `WITH (id INT, name VARCHAR(50))`	• Parses JSON into a rowset (SQL Server 2016+) • without WITH returns generic key/value/type columns; WITH defines typed columns and supports `AS JSON` for nested fragments.

Table 26: Query Execution Order

The order in which you write SQL clauses is not the order the database evaluates them. Understanding this logical processing order is the single biggest unlock for "why doesn't this work?" errors — it explains why WHERE cannot see SELECT aliases, why GROUP BY rejects non-aggregated columns, and why window functions cannot be filtered in WHERE.

Step	Example	Description
1. FROM / JOIN	`FROM orders o JOIN customers c ON o.customer_id = c.id`	• First step — identifies source tables and performs joins • table aliases defined here are available to every later clause.
2. WHERE	`WHERE o.status = 'shipped'`	• Filters individual rows before any grouping • cannot reference SELECT aliases, aggregate functions, or window functions.
3. GROUP BY	`GROUP BY c.country`	Collapses rows into groups for aggregate computation — every non-aggregated column in SELECT must appear here.
4. HAVING	`HAVING COUNT(o.id) > 5`	• Filters groups after aggregation • can reference aggregate functions; cannot reference SELECT aliases in standard SQL.
5. SELECT (+ window functions)	`SELECT c.country, COUNT(o.id), SUM(o.amount)`	Evaluates expressions, aliases, and window functions. Aliases only come into existence here, which is why WHERE, GROUP BY, and HAVING cannot see them.
6. QUALIFY	`QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) = 1`	• Filters on window function results — what HAVING is to GROUP BY, QUALIFY is to window functions • Snowflake, BigQuery, DuckDB, Databricks; not ANSI standard.
7. DISTINCT	`SELECT DISTINCT country`	Removes duplicate rows from the result set after SELECT expressions are evaluated.
8. ORDER BY	`ORDER BY SUM(o.amount) DESC`	• Sorts the final result set — runs after SELECT, so it CAN reference SELECT aliases • without it, row order is not guaranteed.
9. LIMIT / OFFSET / TOP / FETCH	`LIMIT 10 OFFSET 20`	• Restricts and paginates the already-sorted result • without ORDER BY the rows returned are unpredictable.
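
The steps above can be traced in one query. This is a sketch assuming SQLite via Python's `sqlite3` module and a hypothetical `orders` table. One caveat: SQLite is lenient about SELECT aliases in `WHERE` and `GROUP BY`, so the example demonstrates the rule with an aggregate in `WHERE`, which SQLite does reject.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'DE', 10), (2, 'DE', 20), (3, 'DE', 30),
        (4, 'FR', 500),
        (5, 'US', 5), (6, 'US', 300), (7, 'US', 400);
    """
)

# WHERE runs before grouping, so an aggregate there has nothing to aggregate yet.
try:
    conn.execute("SELECT country FROM orders WHERE COUNT(*) > 1 GROUP BY country")
    where_accepts_aggregate = True
except sqlite3.Error:
    where_accepts_aggregate = False

rows = conn.execute(
    """
    SELECT country, SUM(amount) AS total   -- step 5: alias is defined here
    FROM orders                            -- step 1
    WHERE amount >= 10                     -- step 2: drops the 5.00 order
    GROUP BY country                       -- step 3
    HAVING COUNT(*) > 1                    -- step 4: drops FR (one row)
    ORDER BY total DESC                    -- step 8: alias is visible now
    LIMIT 1                                -- step 9
    """
).fetchall()

print(where_accepts_aggregate, rows)
```

Reading the comments in step order rather than top to bottom shows why the row filter belongs in `WHERE`, the group filter in `HAVING`, and why only `ORDER BY` may use `total`.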

Table 27: Advanced Ordering

Beyond a single ORDER BY column ASC, real queries often need per-column sort directions, explicit NULL placement, custom priority orders, locale-aware comparison, or row counts that include ties. The exact syntax and defaults vary across SQL dialects, so portable code typically spells out direction, NULL handling, and tie-handling explicitly rather than relying on the database default.

Technique	Example	Description
ORDER BY ASC / DESC	`ORDER BY salary DESC, name ASC`	• Multi-column sort — secondary columns resolve ties from the primary column • ASC is the default; direction is per-column, so `ORDER BY a, b DESC` sorts `a` ASC then `b` DESC.
NULLS FIRST / NULLS LAST	`ORDER BY score DESC NULLS LAST`	• Explicit NULL positioning in the sort (ANSI standard) • defaults differ: PostgreSQL/Oracle treat NULL as larger (NULLS LAST for ASC); SQL Server, MySQL, SQLite treat NULL as smaller • SQL Server and MySQL do not support the keywords; emulate them with a leading sort key such as `CASE WHEN col IS NULL THEN 1 ELSE 0 END`.
ORDER BY column position	`ORDER BY 1, 2`	• Reference by SELECT-list position — 1 = first selected column • convenient for ad-hoc queries but fragile in production: reordering the SELECT list silently changes the sort.
CASE in ORDER BY	`ORDER BY CASE status WHEN 'urgent' THEN 1 WHEN 'normal' THEN 2 ELSE 3 END`	• Custom sort priority using a CASE expression as the sort key • enables business-rule ordering that doesn't follow alphabetical or numeric order.
COLLATE	`ORDER BY name COLLATE Latin1_General_CI_AI`	• Overrides the column's collation for that one sort — CI = case-insensitive, AI = accent-insensitive • useful when the column's default collation doesn't match the sort you need (e.g., locale-aware alphabetisation).
TOP WITH TIES	`SELECT TOP 3 WITH TIES name, score FROM leaderboard ORDER BY score DESC`	• Includes rows tied with the last qualifying row's ORDER BY value, so the result may exceed N • requires `ORDER BY`; SQL Server syntax — PostgreSQL uses the SQL-standard `FETCH FIRST n ROWS WITH TIES`.
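
Two of the techniques above, a `CASE` priority key and the portable emulation of `NULLS LAST`, are sketched below. The example assumes SQLite via Python's `sqlite3` module and a hypothetical `tickets` table; both `ORDER BY` clauses use only syntax that the major dialects accept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT, score INTEGER);
    INSERT INTO tickets VALUES
        (1, 'normal', 40), (2, 'urgent', NULL), (3, 'low', 90), (4, 'urgent', 75);
    """
)

# Business-rule ordering: urgent first, then normal, then everything else.
by_priority = [row[0] for row in conn.execute(
    """
    SELECT id FROM tickets
    ORDER BY CASE status WHEN 'urgent' THEN 1 WHEN 'normal' THEN 2 ELSE 3 END,
             id
    """
)]

# Portable NULLS LAST: sort on "is it NULL?" first, then on the value itself.
nulls_last = [row[0] for row in conn.execute(
    """
    SELECT id FROM tickets
    ORDER BY CASE WHEN score IS NULL THEN 1 ELSE 0 END,
             score DESC
    """
)]

print(by_priority, nulls_last)
```

Adding `id` as a final sort key in the first query makes the order deterministic when several rows share a priority, which matters as soon as the result is paginated.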
