SQL (Structured Query Language) is the standard language for managing and manipulating relational databases, used across virtually all industries for data retrieval, analysis, and transformation. Whether querying a simple table or orchestrating complex multi-table joins with aggregations and window functions, SQL provides a declarative syntax where you specify what you want, not how to get it β the database engine handles optimization. Mastering SQL means understanding not just the syntax, but the logical execution order (FROM β WHERE β GROUP BY β HAVING β SELECT β ORDER BY), which fundamentally differs from how queries are written and explains many common pitfalls.
27 tables, 190 concepts. Select a concept node to jump to its table row.
Table 1: Basic Query Structure
Every SQL query is built from the same six core clauses, but they don't run in the order you write them β the engine processes FROM first and SELECT near the end, which is why aliases work in ORDER BY but not in WHERE. Knowing this logical processing order is the single most useful mental model for explaining why HAVING filters groups while WHERE filters rows, and why row-limiting clauses without ORDER BY return non-deterministic results.
| Clause | Example | Description |
|---|---|---|
SELECT name, salary FROM employees | β’ Specifies which columns to return and computes their expressions β’ logically executed near the end (after FROM, WHERE, GROUP BY, HAVING), so its aliases are not visible to those clauses. | |
FROM employees | β’ Identifies the source tables, views, or subqueries and applies any JOINs β’ logically the first clause processed, so its tables and columns are visible to every later clause. | |
WHERE salary > 50000 | β’ Filters individual rows before grouping β’ cannot reference aggregate functions or SELECT-list aliases (SELECT hasn't run yet). | |
GROUP BY department | β’ Collapses rows into one row per distinct group so aggregates like SUM and COUNT can be applied β’ every non-aggregated column in SELECT must also appear here. | |
HAVING COUNT(*) > 10 | β’ Filters groups after aggregation β’ the only clause where aggregate functions are valid filter conditions. | |
ORDER BY salary DESC | β’ Sorts the final result set β’ runs after SELECT, so it can reference SELECT-list aliases; ASC is the default direction. | |
SELECT DISTINCT country FROM customers | β’ Removes duplicate rows from the result, comparing the full tuple of selected columns β’ not per-column β every selected column counts toward uniqueness. | |
LIMIT 10TOP 10FETCH FIRST 10 ROWS ONLY | β’ Caps the number of rows returned β’ syntax varies by dialect: LIMIT in PostgreSQL, MySQL, SQLite, Snowflake; TOP in SQL Server; FETCH FIRST is the ANSI standard supported by Oracle, DB2, Snowflake, and PostgreSQL. | |
LIMIT 10 OFFSET 20 | β’ Skips a number of rows before returning results, typically for pagination β’ without ORDER BY the skipped subset is undefined, so different pages can repeat or miss rows. | |
SELECT name, RANK() OVER wFROM employeesWINDOW w AS (PARTITION BY dept ORDER BY salary DESC) | β’ Defines a named window specification that multiple window functions in the same query can reference via OVER wβ’ removes duplication when several aggregates share one partition / order. |
Table 2: DDL Statements
Data Definition Language (DDL) shapes the database itself β creating, altering, and dropping the tables, columns, and databases that hold your data. Unlike DML, several DDL statements behave very differently across engines: TRUNCATE is transactional in PostgreSQL and SQL Server but auto-commits in MySQL and Oracle, and rename syntax splits into ANSI ALTER TABLE ... RENAME TO vs SQL Server's sp_rename.
| Statement | Example | Description |
|---|---|---|
CREATE TABLE users ( id INT PRIMARY KEY, name VARCHAR(100) NOT NULL, email VARCHAR(255) UNIQUE) | Creates a new table with column definitions, data types, and constraints. | |
ALTER TABLE employees ADD COLUMN phone VARCHAR(20) | β’ Adds a new column to an existing table β’ existing rows get NULL unless a DEFAULT is specified. | |
ALTER TABLE products ALTER COLUMN price DECIMAL(12,2) | β’ Changes a column's definition (data type, constraints) β’ syntax varies ( MODIFY in MySQL, ALTER COLUMN in SQL Server / PostgreSQL). | |
ALTER TABLE users DROP COLUMN legacy_flag | β’ Permanently removes a column and its data β’ in PostgreSQL it's metadata-only, so disk space isn't reclaimed until the table is rewritten | |
ALTER TABLE customer RENAME TO customers | β’ Renames an existing table β’ syntax varies ( RENAME TO in PostgreSQL / Oracle / MySQL, sp_rename in SQL Server). | |
DROP TABLE IF EXISTS temp_staging | β’ Permanently deletes a table, all its data, indexes, and triggers β’ IF EXISTS prevents an error if already absent; CASCADE removes dependent foreign-key constraints and views. | |
TRUNCATE TABLE audit_log | β’ Removes all rows while preserving table structure β’ much faster than DELETE; does not fire row-level DELETE triggers; rollback works in PostgreSQL / SQL Server but auto-commits in MySQL / Oracle. | |
CREATE DATABASE analytics | Creates a new database on the server β a top-level container with its own schemas, users, and storage. |
Table 3: DML Statements
Data Manipulation Language (DML) statements write to data rather than to the schema: INSERT adds rows, UPDATE modifies them, DELETE removes them, and MERGE does any combination of the three in one set-based statement. Every DML statement that modifies rows can be rolled back inside an explicit transaction β and every one of them silently affects every row in the table if you forget the WHERE clause.
| Statement | Example | Description |
|---|---|---|
INSERT INTO orders (customer_id, amount, status)VALUES (42, 299.99, 'pending') | β’ Inserts one or more rows into a table using literal values β’ column list is optional, but if omitted the values are matched positionally against the table's declared column order. | |
INSERT INTO archiveSELECT * FROM orders WHERE order_date < '2025-01-01' | β’ Bulk-inserts rows from a query result; columns are paired positionally between SELECT and the target β’ standard pattern for copying, archiving, or staging data. | |
UPDATE employeesSET salary = salary * 1.1WHERE department = 'Engineering' | β’ Modifies existing rows that match the WHERE condition β’ without WHERE every row is updated β a classic production-disaster mistake. | |
UPDATE eSET e.dept_name = d.nameFROM employees eJOIN departments d ON e.dept_id = d.id | β’ Updates rows using data from a joined table β’ syntax differs across dialects: T-SQL uses UPDATE...FROM...JOIN, MySQL uses UPDATE t1 JOIN t2...SET, PostgreSQL uses UPDATE...FROM with the join condition in WHERE. | |
DELETE FROM sessionsWHERE last_active < NOW() - INTERVAL '30 days' | β’ Deletes matching rows and logs each row removal individually β’ fires row-level triggers and can be rolled back; without WHERE it empties the table (but leaves the schema intact). | |
MERGE target AS tUSING source AS s ON t.id = s.idWHEN MATCHED THEN UPDATE SET t.val = s.valWHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val)WHEN NOT MATCHED BY SOURCE THEN DELETE; | β’ Combines INSERT, UPDATE, and DELETE in one statement based on a match condition (added to PostgreSQL in v15) β’ requires a trailing semicolon in SQL Server; under concurrency, WITH (HOLDLOCK) is needed to prevent the well-known UPSERT race condition. |
Table 4: Join Types
Joins combine rows from two or more tables into a single result. Choosing the right join controls which rows survive when keys don't match on every side, and a misplaced filter or wrong join type is one of the most common sources of silently wrong query results.
| Type | Example | Description |
|---|---|---|
SELECT * FROM orders o INNER JOIN customers c ON o.customer_id = c.id | β’ Returns only rows with matching values in both tables β’ the default and most common join β’ rows with NULL keys on either side are excluded because NULL never equals NULL. | |
SELECT * FROM customers c LEFT JOIN orders o ON c.id = o.customer_id | β’ Returns all rows from the left table plus matching rows from the right β’ non-matching right rows produce NULL β’ a filter on a right-table column in WHERE silently converts the join back to an INNER JOIN β put such filters in the ON clause instead. | |
SELECT * FROM orders o RIGHT JOIN customers c ON o.customer_id = c.id | β’ Returns all rows from the right table plus matching rows from the left β’ mirror of LEFT JOIN; rarely used because flipping the table order to a LEFT JOIN is more readable and more portable. | |
SELECT * FROM customers c FULL OUTER JOIN orders o ON c.id = o.customer_id | β’ Returns all rows from both tables, with NULLs for the side that didn't match β’ not supported in MySQL β emulate by UNIONing a LEFT JOIN with a RIGHT JOIN. | |
SELECT * FROM colors CROSS JOIN sizes | β’ Produces a Cartesian product β every row from the first table paired with every row from the second (N Γ M rows) β’ takes no ON conditionβ’ the old comma syntax FROM a, b produces the same Cartesian product. | |
SELECT e.name, m.name AS manager FROM employees e JOIN employees m ON e.manager_id = m.id | β’ Joins a table to itself using two different aliases β’ useful for hierarchical data (employeeβmanager) or comparing rows within the same table β’ aliases are mandatory β without them the column references are ambiguous. | |
SELECT d.name, e.name, e.salaryFROM departments dCROSS JOIN LATERAL ( SELECT name, salary FROM employees WHERE department_id = d.id ORDER BY salary DESC LIMIT 3) e | β’ Allows the right-hand subquery to reference columns from tables earlier in the FROM clause, evaluating it once per outer row (like a correlated subquery in the FROM list) β’ enables top-N-per-group and per-row table-function calls that a plain JOIN cannot express β’ pair with LEFT JOIN LATERAL ... ON true to keep outer rows that produce no inner rows. | |
SELECT d.name, e.nameFROM departments dCROSS APPLY ( SELECT TOP 3 name FROM employees WHERE department_id = d.id ORDER BY salary DESC) e | β’ SQL Server's equivalent of LATERAL: evaluates the right-hand table expression once per left-hand row β’ CROSS APPLY drops left rows whose right expression returns no rows (like INNER JOIN) β’ OUTER APPLY keeps them with NULLs on the right (like LEFT JOIN). | |
SELECT * FROM employees NATURAL JOIN departments | β’ Automatically joins on every column with the same name in both tables and collapses those columns into one β’ risky in production β adding a same-named column to either table later silently changes the join condition β’ prefer explicit ON or USING (col). |
Table 5: Aggregate Functions
Aggregate functions collapse many rows into a single summary value over a group, and almost all of them silently ignore NULL inputs. Two traps catch even experienced users: COUNT(*) counts every row while COUNT(col) skips NULLs in col, and SUM/AVG/MIN/MAX over an all-NULL or empty group return NULL β never zero.
| Function | Example | Description |
|---|---|---|
SELECT COUNT(*) FROM ordersSELECT COUNT(DISTINCT customer_id) FROM orders | β’ COUNT(*) counts all rows including NULLsβ’ COUNT(col) skips NULLs; DISTINCT counts unique values. | |
SELECT SUM(amount) FROM payments | β’ Totals all non-NULL values in a column β’ returns NULL if all values are NULL. | |
SELECT AVG(salary) FROM employees WHERE dept = 'Sales' | β’ Arithmetic mean of non-NULL values β’ NULLs are excluded, not treated as zero. | |
SELECT MAX(order_date) FROM orders | β’ Returns the largest value in a column β’ works on dates and strings too. | |
SELECT MIN(price) FROM products WHERE active = 1 | Returns the smallest value in a column. | |
STRING_AGG(name, ', ') WITHIN GROUP (ORDER BY name) | β’ Concatenates string values from a group into a single delimited string β’ SQL Server/PostgreSQL use STRING_AGG β’ MySQL uses GROUP_CONCAT. | |
SELECT VAR(salary) FROM employeesSELECT VARIANCE(salary) FROM employees | β’ Statistical variance of the values in a group β’ VAR/VAR_SAMP (sample), VARP/VAR_POP (population) β’ name varies by database. | |
SELECT STDEV(salary) FROM employees | β’ Standard deviation of values β’ STDEV/STDDEV_SAMP (sample), STDEVP/STDDEV_POP (population) β’ name varies by database. |
Table 6: Window Functions β Ranking
The ranking family assigns a position number to each row inside an ordered partition. The differences between them are entirely about how ties and gaps are handled, so picking the wrong one quietly changes results. Every ranking function requires ORDER BY inside its OVER clause; PARTITION BY is optional and restarts the numbering for each group.
| Function | Example | Description |
|---|---|---|
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) | β’ Assigns a unique sequential integer per row within a partition β’ ties receive different numbers (broken by ORDER BY or arbitrarily). | |
RANK() OVER (ORDER BY score DESC) | β’ Assigns rank with gaps after ties β two rows tied at rank 2 are both rank 2; next row is rank 4 β’ competition-style ranking. | |
DENSE_RANK() OVER (ORDER BY score DESC) | β’ Like RANK but no gaps β two rows tied at rank 2 are both rank 2 β’ next row is rank 3. | |
NTILE(4) OVER (ORDER BY salary) | β’ Divides the ordered partition into N buckets as equal as possible β’ if rows don't divide evenly the earliest buckets get the extra rows. | |
PERCENT_RANK() OVER (ORDER BY salary) | β’ Relative rank (rank - 1) / (total rows - 1), range 0 to 1 inclusiveβ’ the first row is always 0.0. | |
CUME_DIST() OVER (ORDER BY salary) | β’ Cumulative distribution β rows with value β€ current row, divided by total rows β’ range 1/N to 1 β’ tied rows share the same value. |
Table 7: Window Functions β Aggregate & Analytical
Window functions compute a value for each row using a window of related rows, without collapsing the result the way GROUP BY does. The window is shaped by three parts of the OVER clause β PARTITION BY (which rows belong together), ORDER BY (their sequence), and the optional frame (ROWS / RANGE β which of those rows are visible from the current one). Most window-function surprises come from the frame: adding ORDER BY without an explicit frame defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which silently turns whole-partition aggregates into running ones and makes LAST_VALUE / NTH_VALUE return the wrong row.
| Function | Example | Description |
|---|---|---|
SUM(amount) OVER (PARTITION BY dept ORDER BY date) | β’ Running total within a partition when ORDER BY is presentβ’ without ORDER BY, returns the partition total beside every row. | |
AVG(salary) OVER (PARTITION BY department_id) | β’ Partition mean on every row when no ORDER BY is givenβ’ adding ORDER BY turns it into a running average. | |
COUNT(*) OVER (PARTITION BY customer_id) | β’ Returns the number of rows in the partition beside every detail row β’ useful for showing a per-group total without GROUP BY. | |
LAG(close_price, 1, 0) OVER (ORDER BY date) | β’ Returns the value offset rows before the current row in the windowβ’ the first row returns NULL unless a default is given as the third argument. | |
LEAD(close_price, 1, 0) OVER (ORDER BY date) | β’ Returns the value offset rows after the current row in the windowβ’ the last row returns NULL unless a default is given as the third argument. | |
FIRST_VALUE(salary) OVER (PARTITION BY dept ORDER BY hire_date) | β’ Returns the value at the first row of the window frame β’ the default frame starts at the partition's first row, so the result is usually what you expect. | |
LAST_VALUE(salary) OVER (PARTITION BY dept ORDER BY hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) | β’ Returns the value at the last row of the window frame β’ the default frame stops at the current row, so an explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is needed for the true partition last value. | |
NTH_VALUE(salary, 2) OVER ( PARTITION BY dept ORDER BY salary DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) | β’ Returns the value at the nth row of the window frame (counting from 1) β’ returns NULL if n exceeds the frame, so expand the frame to UNBOUNDED FOLLOWING when you need access to rows past the current position. |
Table 8: Window Frame Specification
The frame clause narrows a window function's view to a subset of the partition relative to the current row, controlling exactly which rows feed running totals, moving averages, and lookups like LAST_VALUE(). The most important and most surprising rule: when ORDER BY is present but no frame is specified, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (peer-based, not row-based) β which silently fuses tied rows into a single running-sum step.
| Clause | Example | Description |
|---|---|---|
SUM(amount) OVER ( ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) | β’ Frame defined by physical row offsets β counts rows regardless of their ORDER BY values β’ preferred for fixed-size moving calculations like 7-day windows. | |
SUM(amount) OVER ( ORDER BY date RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) | β’ Frame defined by logical value ranges β all rows that are ORDER BY peers of the current row are included together β’ this is the default frame when ORDER BY is present and no frame clause is given. | |
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | Frame starts at the first row of the current partition β used for cumulative aggregations from the partition start. | |
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING | Frame extends to the last row of the current partition β required for LAST_VALUE() and NTH_VALUE() to see all subsequent rows in standard SQL. | |
ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING | β’ Frame spans N rows before to M rows after the current row β’ in ROWS mode the offset must be a non-negative integer; in RANGE mode it requires exactly one numeric or date ORDER BY column. | |
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | β’ In ROWS mode means just the current row; in RANGE/GROUPS mode it spans all peer rows with the same ORDER BY value β’ this peer behavior is what produces the running-total surprise on duplicate ordering keys. |
Table 9: Subqueries & CTEs
Subqueries nest one query inside another to compute filters, derived tables, or single values, while Common Table Expressions (CTEs) name those intermediate results to break a complex query into readable, sometimes recursive, steps. Knowing when each one short-circuits, gets correlated per row, or is materialized once is the difference between a query that scales and one that crawls.
| Technique | Example | Description |
|---|---|---|
SELECT * FROM employeesWHERE dept_id IN (SELECT id FROM departments WHERE budget > 100000) | β’ Embeds a query inside another query for filtering β’ the inner query runs once (non-correlated) or per row (correlated). | |
SELECT * FROM employees e1WHERE salary > (SELECT AVG(salary) FROM employees e2 WHERE e2.dept = e1.dept) | β’ References an outer query's columns inside the subquery β’ logically executes once per outer row, though modern optimizers often decorrelate it into a join. | |
SELECT dept, avg_sal FROM(SELECT dept, AVG(salary) AS avg_sal FROM employees GROUP BY dept) t | β’ A subquery in the FROM clause β’ must have a table alias in most dialects β’ computed once and referenced like a table. | |
SELECT name FROM customers c WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id) | β’ Tests for row existence and stops at the first match β’ NOT EXISTS is also the safe replacement for NOT IN when the subquery may contain NULLs. | |
WITH ranked AS ( SELECT *, RANK() OVER (ORDER BY salary DESC) AS rk FROM employees)SELECT * FROM ranked WHERE rk <= 10 | β’ Named temporary result set introduced with WITH and scoped to a single statementβ’ PostgreSQL 12+ may inline (fold) a CTE unless you write AS MATERIALIZED. | |
WITH RECURSIVE org AS ( SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL UNION ALL SELECT e.id, e.name, e.manager_id FROM employees e JOIN org o ON e.manager_id = o.id)SELECT * FROM org | β’ CTE that references itself to traverse hierarchical or graph data (org charts, bills of materials) β’ requires an anchor member, UNION ALL (or UNION) with a recursive member, and a condition that eventually returns no rows. | |
SELECT name, (SELECT MAX(salary) FROM employees) AS max_sal FROM employees | β’ A subquery used as a single value (one row, one column) in SELECT, WHERE, etc.β’ returns NULL if zero rows and raises a cardinality error if it returns more than one. |
Table 10: Set Operations
Set operators combine the rows of two or more SELECT queries that return the same number of columns with compatible data types. The result uses the column names from the first query, any ORDER BY must come at the very end of the combined statement, and NULLs are treated as equal for the purposes of duplicate elimination β a deliberate exception to SQL's usual three-valued logic.
| Operation | Example | Description |
|---|---|---|
SELECT city FROM customers UNION SELECT city FROM suppliers | β’ Combines result sets and removes duplicates (same logic as DISTINCT)β’ slower than UNION ALL because of the extra deduplication step. | |
SELECT product_id FROM orders_2024 UNION ALL SELECT product_id FROM orders_2025 | β’ Combines result sets including all duplicates β’ faster than UNION β the canonical guidance is to default to UNION ALL and only use UNION when you actually need duplicates removed. | |
SELECT customer_id FROM orders_jan INTERSECT SELECT customer_id FROM orders_feb | β’ Returns rows that appear in both result sets β’ result is deduplicated; some dialects (e.g. PostgreSQL) also support INTERSECT ALL to keep duplicate matches. | |
SELECT customer_id FROM customers EXCEPT SELECT customer_id FROM orders | β’ Returns rows in the first set that do not appear in the second β’ EXCEPT is the ANSI name (PostgreSQL, SQL Server, MySQL 8.0.31+); Oracle uses MINUSβ’ result is deduplicated. |
Table 11: Advanced Grouping
GROUP BY extensions add multi-level subtotals and grand totals in a single pass over the data, replacing stacked UNION ALL queries with concise, optimizer-friendly syntax. ROLLUP gives strict hierarchical subtotals, CUBE gives every combination, and GROUPING SETS lets you cherry-pick exactly the sets you want; GROUPING() and GROUPING_ID() then let you tell super-aggregate rows apart from rows that happen to contain real NULLs.
| Extension | Example | Description |
|---|---|---|
SELECT year, quarter, SUM(sales)FROM revenueGROUP BY ROLLUP(year, quarter) | β’ Generates hierarchical subtotals β produces N+1 grouping sets for N columns (each prefix of the column list plus the grand total) β’ e.g., totals by (year, quarter), (year), and grand total in one query. | |
SELECT dept, year, SUM(revenue)FROM sales GROUP BY CUBE(dept, year) | β’ Generates every subset of the listed columns β 2^N grouping sets for N columns (the full power set) β’ includes every cross-tabulation plus the grand total. | |
GROUP BY GROUPING SETS ((dept, year), (dept), (year), ()) | β’ Explicit control over which grouping combinations to compute β no implicit hierarchy, no full crossing β’ avoids paying for CUBE combinations you don't need. | |
SELECT GROUPING(dept), dept, SUM(sal)FROM employees GROUP BY ROLLUP(dept) | β’ Returns 1 for super-aggregate rows (where the column was rolled up to NULL) and 0 for normal rows β’ the only reliable way to distinguish ROLLUP/CUBE nulls from actual NULL values in the data. | |
SELECT GROUPING_ID(dept, year), dept, year, SUM(sal)FROM employees GROUP BY ROLLUP(dept, year) | β’ Returns an integer bitmap where each bit is the GROUPING() value for that column β identifies the row's grouping level in a single valueβ’ cleaner than multiple GROUPING() calls when you need to label, sort, or filter by level. |
Table 12: PIVOT / UNPIVOT
PIVOT and UNPIVOT reshape result sets between long (one row per value) and wide (one column per category) layouts, which is the core operation behind most cross-tab reports. They are T-SQL extensions also supported in Oracle and Snowflake (with dialect-specific syntax), but PostgreSQL and MySQL have no native PIVOT β for those engines, and for any portable code, use conditional aggregation with CASE inside aggregate functions.
| Operator | Example | Description |
|---|---|---|
SELECT dept, [2023], [2024], [2025]FROM sales_dataPIVOT ( SUM(amount) FOR year IN ([2023],[2024],[2025])) AS pvt | β’ Rotates row values into columns β aggregates a measure and spreads distinct values from one column into separate columns β’ Aggregate function is always required, even when each cell maps to one row; values not in the IN list are silently dropped. | |
SELECT dept, year, amountFROM wide_tableUNPIVOT ( amount FOR year IN ([2023],[2024],[2025])) AS unpvt | β’ Reverses a pivot β turns column headers back into row values to normalise wide tables β’ In SQL Server, NULL-valued input cells produce no output row (Oracle exposes INCLUDE NULLS / EXCLUDE NULLS; T-SQL defaults to drop). | |
SELECT dept, SUM(CASE WHEN year=2024 THEN sales END) AS [2024], SUM(CASE WHEN year=2025 THEN sales END) AS [2025]FROM sales GROUP BY dept | β’ ANSI-standard, portable alternative to PIVOT using CASE inside aggregate functions β runs unchanged on SQL Server, PostgreSQL, MySQL, Oracle, SQLiteβ’ Also the standard workaround when pivoted columns are determined at runtime, paired with dynamic SQL to build the column list. |
Table 13: Conditional Logic
Conditional expressions let a query branch row-by-row inside SELECT, WHERE, ORDER BY, and UPDATE clauses without leaving SQL. CASE is the portable, ANSI-standard primitive; vendors layer terser shorthands like IIF, CHOOSE, and Oracle's DECODE on top β each with its own NULL-handling quirks worth knowing.
| Expression | Example | Description |
|---|---|---|
CASE WHEN salary > 100000 THEN 'Senior' WHEN salary > 60000 THEN 'Mid' ELSE 'Junior'END | β’ SQL's primary conditional expression β’ evaluates WHEN conditions top-down and returns the first match β later conditions are skipped β’ returns NULL if no WHEN matches and ELSE is omitted. | |
CASE status WHEN 'A' THEN 'Active' WHEN 'I' THEN 'Inactive' ELSE 'Unknown'END | β’ Equality shorthand β compares one expression to a list of values using =β’ cannot match NULL (because NULL = NULL is UNKNOWN) β use searched CASE WHEN ... IS NULL for that. | |
IIF(salary > 50000, 'High', 'Low') | β’ Ternary shorthand for a two-branch CASE (SQL Server / Access) β’ rewritten internally as CASE; nesting is capped at 10 levels, same as CASE. | |
CHOOSE(quarter, 'Q1', 'Q2', 'Q3', 'Q4') | β’ Returns the Nth item from a value list using a 1-based integer index β’ SQL Server only; an out-of-range or NULL index returns NULL (no error). | |
DECODE(status, 1, 'Active', 0, 'Inactive', 'Unknown') | β’ Oracle-proprietary equality-based CASE substitute: DECODE(expr, search, result [, ...] [, default])β’ uniquely treats two NULLs as equal β opposite of standard CASE. |
Table 14: String Functions
String functions transform, inspect, and assemble text values directly inside SQL. Their behavior β especially around NULL, trailing spaces, and which dialect supports which syntax β varies meaningfully between SQL Server, PostgreSQL, MySQL, and Oracle, so the same expression can return different results depending on the engine.
| Function | Example | Description |
|---|---|---|
CONCAT(first_name, ' ', last_name) | β’ Joins strings end-to-end β’ in SQL Server and MySQL, NULL arguments are converted to empty strings and ignoredβ’ PostgreSQL/Oracle/SQLite also support the || operator, which by contrast returns NULL if any operand is NULL. | |
CONCAT_WS(', ', city, state, country) | β’ Concatenates with a separator β first argument is the separator β’ ignores NULL arguments and does NOT emit the separator for them (so the separator only ever appears between non-NULL values); SQL Server 2017+, PostgreSQL, MySQL. | |
UPPER(last_name) | β’ Converts a string to uppercase or lowercase according to the database's locale β’ case folding is locale-sensitive β the Turkish dotted/dotless I is a classic gotcha. | |
TRIM(LEADING ' ' FROM name) | β’ Removes leading, trailing, or both spaces (or specified characters) β’ standard SQL TRIM supports LEADING / TRAILING / BOTHβ’ SQL Server gained TRIM in 2017; the LEADING / TRAILING / BOTH keywords require SQL Server 2022 with compatibility level 160. | |
SUBSTRING(email, 1, CHARINDEX('@', email) - 1) | β’ Extracts a portion of a string given a 1-based start position and optional length β’ positions start at 1 in every major dialect; Oracle's SUBSTR also accepts a negative start to count from the end of the string, but SQL Server's SUBSTRING does not. | |
LEFT(phone, 3)RIGHT(zip_code, 4) | β’ Extracts N characters from the left or right end of a string β’ a convenient shorthand for SUBSTRING. | |
WHERE LEN(description) > 500 | β’ Returns the number of characters in a string β’ SQL Server's LEN silently excludes trailing spaces β use DATALENGTH if you need the byte length including themβ’ PostgreSQL/MySQL's LENGTH counts everything. | |
REPLACE(phone, '-', '') | β’ Substitutes all occurrences (not just the first) of a substring with another string β’ in SQL Server the match honors the column's collation, which is case-insensitive by default. | |
CHARINDEX('@', email) | β’ Returns the 1-based character position of a substring, or 0 if not found (never NULL unless an argument is NULL)β’ CHARINDEX in SQL Server, INSTR in Oracle/MySQL, POSITION(substr IN str) in PostgreSQL. | |
SELECT value FROM STRING_SPLIT('a,b,c', ',') | β’ Splits a string on a single-character delimiter and returns a one-column table β’ SQL Server 2016+ β’ output order is not guaranteed unless you pass enable_ordinal => 1 (SQL Server 2022+) and ORDER BY ordinalβ’ PostgreSQL uses string_to_table / regexp_split_to_table. | |
LPAD(CAST(id AS VARCHAR), 8, '0')RPAD(name, 20, ' ') | β’ Pads a string to a specified length by prepending/appending fill characters; truncates from the opposite end if the input is already longer β’ PostgreSQL/MySQL/Oracle have these built in; SQL Server has no native LPAD β the usual workaround is RIGHT('00000000' + CAST(id AS VARCHAR), 8). | |
INITCAP('john SMITH') | β’ Capitalizes the first letter of each word and lowercases the rest, where "word" means a run of alphanumeric characters separated by non-alphanumerics β’ PostgreSQL and Oracle only β no direct equivalent in SQL Server. | |
FORMAT(amount, 'C2', 'en-US') | β’ Formats a value as a string using .NET format strings with optional culture (SQL Server only) β’ supports currency, dates, and numbers with locale awareness, but relies on the CLR and is notably slower than CONVERT on large row counts. |
Table 15: Date & Time Functions
Date and time handling varies more between dialects than almost any other area of SQL β function names, return types, and time-zone semantics all differ. The functions below cover the common needs: getting the current moment, doing arithmetic on dates, extracting parts, truncating to a unit, and formatting for display. Watch for two traps that bite even experienced developers: DATEDIFF counts boundary crossings rather than elapsed time, and NOW() in PostgreSQL returns the transaction start time, not the actual clock.
| Function | Example | Description |
|---|---|---|
SELECT GETDATE() | β’ Returns the current date and time β’ GETDATE() in SQL Server returns server local time; use GETUTCDATE() or SYSUTCDATETIME() for UTCβ’ NOW() in PostgreSQL/MySQL and CURRENT_TIMESTAMP (ANSI) work across dialects. | |
DATEADD(month, 3, hire_date)hire_date + INTERVAL '3 months' | β’ Adds a time interval to a date; pass a negative number to subtract β’ DATEADD in SQL Server, DATE_ADD in MySQL, + INTERVAL in PostgreSQL/Oracle. | |
DATEDIFF(day, start_date, end_date) | β’ Returns the number of datepart boundaries crossed between two dates β not strictly elapsed time β’ DATEDIFF(year, '2024-12-31', '2025-01-01') returns 1 even though the dates are 1 day apartβ’ SQL Server / MySQL syntax; PostgreSQL uses EXTRACT or subtraction. | |
DATEPART(year, hire_date)EXTRACT(YEAR FROM hire_date) | β’ Returns a specific date component (year, month, day, hour, etc.) as an integer β’ DATEPART is T-SQL; EXTRACT is the ANSI-standard form supported by PostgreSQL, MySQL, Oracle, and SQL Server 2022+. | |
DATE_TRUNC('month', created_at) | β’ Truncates a timestamp to a specified precision (year, quarter, month, day, hour) β ideal for grouping by month or quarter β’ PostgreSQL / Snowflake / Redshift; SQL Server 2022+ provides DATETRUNC(). | |
CAST('2025-01-15' AS DATE) | β’ Converts a string to a date/datetime value (ANSI standard, works across dialects) β’ In SQL Server, ambiguous formats like '01/02/2025' depend on session SET DATEFORMAT / language; ISO 8601 (YYYY-MM-DD) is always safe. | |
TO_CHAR(order_date, 'YYYY-MM')FORMAT(order_date, 'yyyy-MM') | β’ Formats a date as a string with a template pattern β useful for display or grouping by year-month β’ TO_CHAR in PostgreSQL/Oracle/Snowflake; FORMAT in SQL Server 2012+ and MySQL. | |
EOMONTH(GETDATE()) | β’ Returns the last day of the month for a given date (as a date, with the time component dropped)β’ Optional second argument adds/subtracts months before computing month-end β’ SQL Server 2012+ and MySQL. | |
CONVERT(VARCHAR, GETDATE(), 103) | β’ SQL Server's formats a date to a string using a numeric style code β’ Style 103 = dd/mm/yyyy, 120 = yyyy-mm-dd hh:mi:ss, 126 = ISO 8601β’ SQL Server only; other dialects use TO_CHAR or FORMAT. |
Table 16: Numeric Functions
Numeric functions handle the arithmetic, rounding, and randomness work that lives outside of plain + - * /. Dialect quirks bite hard here β what LOG(100) returns, whether ROUND rounds halves away from zero or to-even, and whether RAND() re-runs per row all change between vendors.
| Function | Example | Description |
|---|---|---|
ROUND(123.45, -1) β 120 | β’ Rounds to N decimal places β’ negative N rounds to tens, hundreds, etc. β’ SQL Server and PostgreSQL numeric break ties away from zero. | |
FLOOR(-4.7) β -5CEILING(4.2) β 5 | β’ FLOOR returns the largest integer β€ value (rounds toward ββ) β’ CEILING returns the smallest integer β₯ value (rounds toward +β). | |
ABS(balance - target) | β’ Returns absolute value &mdash β’ strips the sign β’ Can overflow on the minimum signed integer (e.g. ABS(-2147483648) in SQL Server). | |
POWER(2, 10) β 1024SQRT(144) β 12 | β’ POWER(x, y) raises x to the y-th power β’ SQRT returns the square root; negative input raises a domain error in most engines. | |
SELECT 17 % 5 β 2MOD(17, 5) β 2 | β’ Returns the remainder after division β’ % in SQL Server/PostgreSQL, MOD() in Oracle/MySQL/PostgreSQLβ’ In Postgres the result inherits the sign of the dividend. | |
LOG(EXP(1)) β 1 (SQL Server)LOG(100) β 2 (PostgreSQL) | β’ SQL Server / MySQL: LOG(x) = natural log; LOG(x, base) for arbitrary base in SQL Serverβ’ PostgreSQL: LOG(x) = base-10; use LN(x) for natural logβ’ LOG10 is always base-10. | |
SIGN(balance) β -1, 0, or 1 | β’ Returns β1, 0, or +1 for negative, zero, or positive input β’ useful for directional logic without CASE. | |
SELECT RAND() (SQL Server)SELECT RANDOM() (PostgreSQL) | β’ Returns a pseudo-random float between 0 and 1 β’ RAND in SQL Server/MySQL, RANDOM() in PostgreSQLβ’ In SQL Server, RAND() evaluates once per query β use NEWID() for per-row randomness. | |
TRUNC(3.987, 2) β 3.98TRUNC(-4.7) β -4 | β’ Chops (does not round) to N decimal places β’ Oracle/PostgreSQL use TRUNC, MySQL uses TRUNCATEβ’ Differs from FLOOR for negatives (TRUNC rounds toward zero, FLOOR toward ββ). |
Table 17: NULL Handling
SQL treats NULL as "unknown" under three-valued logic, so most operators that touch a NULL return UNKNOWN rather than TRUE or FALSE β which is why WHERE col = NULL never matches anything and why NOT IN against a subquery containing a NULL silently returns zero rows. The functions and predicates below are the safe, portable tools for testing, substituting, and aggregating NULL data across every major dialect.
| Technique | Example | Description |
|---|---|---|
WHERE phone IS NULLWHERE email IS NOT NULL | β’ Tests for NULL values β’ the only safe way β = NULL and != NULL always evaluate to UNKNOWN. | |
COALESCE(mobile, home_phone, 'N/A') | β’ Returns the first non-NULL argument (NULL only if all are NULL) β’ ANSI standard β’ accepts any number of arguments β’ short-circuits left-to-right. | |
NULLIF(denominator, 0) | β’ Returns NULL if both arguments are equal, otherwise the first argument β’ classic use prevents division-by-zero: value / NULLIF(denom, 0). | |
ISNULL(discount, 0) | β’ Two-argument NULL substitution (vendor-specific) β’ SQL Server ISNULL, MySQL IFNULL, Oracle NVLβ’ less portable than COALESCE. | |
AVG(score) (NULL ignored)COUNT(*) (NULL included) | β’ Most aggregates skip NULLs in both numerator and denominator β’ COUNT(*) counts all rows; COUNT(col) skips NULLs. | |
a LEFT JOIN b ON a.id = b.idWHERE b.id IS NULL -- anti-join | β’ NULL keys never match in join conditions β’ filtering an outer-join column in WHERE quietly converts LEFT JOIN to INNER JOINβ’ NOT IN (subquery) returns zero rows if the subquery yields any NULL β use NOT EXISTS instead. |
Table 18: Pattern Matching
SQL provides several pattern-matching predicates for filtering rows by text shape β from the simple wildcard-based LIKE to dialect-specific regex operators and full-text search. Which one you reach for depends on how complex the pattern is and which engine you're on.
| Technique | Example | Description |
|---|---|---|
WHERE name LIKE 'A%'WHERE code LIKE '_-[0-9][0-9]' | β’ % matches any sequence of characters; _ matches exactly one characterβ’ SQL Server also supports character classes [a-z]. | |
WHERE name ILIKE '%smith%' | β’ Case-insensitive LIKE β’ PostgreSQL only β’ equivalent to LIKE with LOWER(). | |
WHERE email NOT LIKE '%@example.com' | Inverts pattern match β returns rows where the value does not match the pattern. | |
REGEXP_LIKE(phone, '^\d{3}-\d{4}$') (Oracle/MySQL)phone ~ '^\d{3}-\d{4}$' (PostgreSQL) | β’ Full regular expression matching β’ REGEXP_LIKE in Oracle/MySQL, ~ in PostgreSQL, SIMILAR TO is SQL standard but limited; REGEXP_MATCH in PostgreSQL extracts matches. | |
WHERE CONTAINS(description, 'fast AND reliable') | β’ Full-text search predicate (SQL Server) β’ requires a full-text index; supports boolean operators and proximity searches. | |
WHERE path LIKE '100\%' ESCAPE '\' | Treats special LIKE metacharacters ( %, _) as literals when prefixed with the escape character. |
Table 19: Data Type Conversion
Explicit conversion functions turn one data type into another and decide what happens when the input doesn't fit. The big split is error vs NULL on failure: CAST / CONVERT / PARSE raise errors, while TRY_CAST / TRY_CONVERT / TRY_PARSE return NULL β much safer for dirty user input. Choosing the right one (and pairing column types in WHERE filters) also has real performance impact, because implicit conversions can stop SQL Server from using an index.
| Function | Example | Description |
|---|---|---|
CAST('3.14' AS DECIMAL(10,2)) | β’ ANSI-standard explicit type conversion β works in T-SQL, PostgreSQL, MySQL, Oracle, Snowflake, DuckDB β’ raises an error if the conversion fails. | |
CONVERT(VARCHAR, GETDATE(), 120) | β’ SQL Server type conversion with an optional style argument for date/number formatting (e.g., 120 = ODBC canonical yyyy-mm-dd hh:mi:ss)β’ MySQL CONVERT has no style code. | |
TRY_CAST(user_input AS INT) | β’ Safe CAST (SQL Server 2012+) β returns NULL instead of an error when the conversion fails β’ ideal for validating user input or dirty data. | |
TRY_CONVERT(DATE, date_string, 101) | β’ Safe CONVERT (SQL Server 2012+) β returns NULL on conversion failure β’ supports the same style codes as CONVERT. | |
TRY_PARSE('β¬1.234,56' AS MONEY USING 'de-DE') | β’ Culture-aware string parsing to date/time or number types β returns NULL on failure β’ relies on the .NET CLR, so slower than TRY_CAST. | |
TO_DATE('05 Dec 2000', 'DD Mon YYYY')TO_NUMBER('12,454.8', '99G999D9') | β’ PostgreSQL and Oracle format-model conversions β’ the explicit format string controls how the input is parsed. | |
PARSE('15 Janvier 2025' AS DATE USING 'fr-FR') | β’ SQL Server culture-aware string-to-date/number parse; raises an error on failure (use TRY_PARSE for the safe variant)β’ depends on the .NET CLR. | |
STR(123.45, 6, 1) returns ' 123.5' | β’ SQL Server float-to-character conversion with fixed length and decimal places β’ right-justifies with leading spaces and returns asterisks ( **) when the length is too small. |
Table 20: Constraints & Keys
Constraints enforce data integrity at the column or table level, rejecting writes that would violate the rules you declare in CREATE TABLE or ALTER TABLE. Watch the NULL handling closely β UNIQUE, CHECK, and FOREIGN KEY each treat NULL differently, and the defaults differ across SQL Server, PostgreSQL, MySQL, and Oracle.
| Constraint | Example | Description |
|---|---|---|
id INT PRIMARY KEY | β’ Uniquely identifies each row β enforces NOT NULL + UNIQUE together β’ one per table; may span multiple columns (composite PK) β’ creates a unique clustered index by default in SQL Server (override with PRIMARY KEY NONCLUSTERED); PostgreSQL has no clustered-index concept. | |
FOREIGN KEY (dept_id) REFERENCES departments(id) | Enforces referential integrity β the value must match a row in the referenced table or be NULL (FK columns are nullable unless you also add NOT NULL). | |
name VARCHAR(100) NOT NULL | β’ Rejects NULL values on INSERT and UPDATE β’ independent from DEFAULT β supplying an explicit NULL still raises an error even if a default exists. | |
email VARCHAR(255) UNIQUE | β’ Ensures all non-NULL values are distinct and creates a backing unique index β’ PostgreSQL / MySQL / Oracle allow multiple NULLs (NULL β NULL); SQL Server's basic UNIQUE allows only one NULL β use a filtered unique index ( WHERE col IS NOT NULL) for multi-NULL uniqueness. | |
salary DECIMAL CHECK (salary >= 0) | β’ Validates a Boolean expression on INSERT and UPDATE β’ NULL is treated as UNKNOWN, not FALSE β a CHECK constraint passes when the expression is TRUE or NULL, so NULLs slip through unless you also add NOT NULL. | |
created_at DATETIME DEFAULT GETDATE() | β’ Supplies a value when the column is omitted from INSERT or you write the DEFAULT keywordβ’ an explicit NULL is NOT replaced by the default β it goes in as NULL (or errors if the column is NOT NULL). | |
FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE | β’ Automatically deletes child rows when the referenced parent row is deleted β’ alternatives: SET NULL nullifies the child FK column, SET DEFAULT uses the child's default, NO ACTION / RESTRICT block the parent delete. |
Table 21: Indexes
Indexes are the single biggest performance lever in SQL: they turn full table scans into fast lookups by keeping a sorted, pointer-based structure (usually a B-tree) over chosen columns. They are pure redundancy though β every INSERT, UPDATE, or DELETE that touches an indexed column also rewrites the index β so over-indexing is just as damaging as missing indexes, and column order in multi-column indexes is rarely a free choice.
| Concept | Example | Description |
|---|---|---|
CREATE INDEX idx_last_name ON customers (last_name) | Creates a non-unique index (B-tree by default) to speed up queries that filter, sort, or join on the column(s). Indexes are redundant copies β they cost storage and slow writes. | |
CREATE UNIQUE INDEX idx_email ON users (email) | β’ Enforces uniqueness on the indexed column(s) while accelerating lookups β’ functionally equivalent to a UNIQUE constraint (which is implemented as a unique index behind the scenes). | |
CREATE CLUSTERED INDEX idx_order_id ON orders (order_id) | β’ Physically orders the table rows by the index key β the table data IS the clustered index's leaf level β’ only one per table; in SQL Server, PRIMARY KEY creates a clustered index by default. PostgreSQL has no clustered-index concept (only a one-time CLUSTER reorder command). | |
CREATE NONCLUSTERED INDEX idx_dept ON employees (department_id) | β’ Maintains a separate structure with key values plus row locators back to the heap or clustered index β’ multiple allowed per table; fast for selective lookups but each non-covered query needs a key lookup. | |
CREATE INDEX idx_name_dept ON employees (last_name, department_id) | β’ Index on multiple columns in a defined order β usable only when the query filters on a leftmost prefix of those columns β’ an index on (A, B) helps WHERE A=β¦ and WHERE A=β¦ AND B=β¦, but not WHERE B=β¦ alone. | |
CREATE INDEX idx_cust ON orders (customer_id)INCLUDE (amount, order_date) | β’ Adds non-key columns to the index leaf via INCLUDE (SQL Server, PostgreSQL 11+)β’ query is satisfied entirely from the index with no key lookup back to the table β a big read-path win for hot queries. | |
CREATE INDEX idx_active ON users (email)WHERE active = 1 | β’ Indexes only the rows matching a WHERE predicate ("filtered index" in SQL Server, "partial index" in PostgreSQL)β’ smaller, cheaper to maintain, and useful when queries usually target a known subset. | |
DROP INDEX idx_last_name ON customers | β’ Removes an index and frees its storage β’ syntax varies: SQL Server uses DROP INDEX name ON table; PostgreSQL and MySQL use DROP INDEX name (MySQL also supports ALTER TABLE β¦ DROP INDEX). Cannot drop indexes backing a PRIMARY KEY or UNIQUE constraint β drop the constraint instead. |
Table 22: Views
Views package a SELECT statement under a name so it can be queried like a table. Standard views store no data β the underlying query re-runs on every reference, which is cheap on storage but offers no speed-up over the raw query. Materialized views flip that trade-off by persisting the result on disk for fast reads at the cost of staleness until the next refresh.
| Technique | Example | Description |
|---|---|---|
CREATE VIEW active_customers ASSELECT * FROM customers WHERE active = 1 | β’ Creates a named, saved query that can be queried like a table β’ not physically materialized β the defining query is re-run on every reference, so storage is zero but there is no performance gain over the underlying SELECT. | |
CREATE OR REPLACE VIEW monthly_sales ASSELECT EXTRACT(MONTH FROM order_date) AS month, SUM(amount) FROM orders GROUP BY 1 | β’ Atomically replaces an existing view's defining query without dropping it β’ PostgreSQL, MySQL, Oracle β’ the new query must produce the same column names, order, and types β extra columns may only be appended at the end β’ SQL Server uses ALTER VIEW or CREATE OR ALTER VIEW (2016 SP1+) instead. | |
CREATE VIEW eng_staff ASSELECT * FROM employees WHERE dept = 'Engineering'WITH CHECK OPTION | β’ Rejects INSERT, UPDATE, or MERGE through the view if the resulting row would no longer satisfy the view's WHERE clause β’ LOCAL checks only the current view; CASCADED (the default if neither is specified) also checks every underlying view. | |
CREATE MATERIALIZED VIEW report_summary ASSELECT dept, SUM(salary) FROM employees GROUP BY dept;REFRESH MATERIALIZED VIEW CONCURRENTLY report_summary; | β’ Stores query results physically β SELECTs are fast but data is stale until the next REFRESH MATERIALIZED VIEWβ’ a plain refresh takes an exclusive lock; CONCURRENTLY (PG 9.4+) avoids blocking readers but requires a UNIQUE index on the viewβ’ PostgreSQL and Oracle; SQL Server's equivalent is an indexed view (auto-maintained on every base-table change, no manual refresh). | |
DROP VIEW IF EXISTS active_customers | β’ Removes only the view definition β underlying tables and their rows are unaffected β’ RESTRICT (the default) blocks the drop if other objects depend on the view; CASCADE drops them too. |
Table 23: Transactions
Transactions group one or more SQL statements into an atomic unit so the database can guarantee ACID properties (Atomicity, Consistency, Isolation, Durability). The statements below mark transaction boundaries, set the visibility rules between concurrent sessions, and (in the case of WITH (NOLOCK)) trade correctness for raw read speed in SQL Server.
| Statement | Example | Description |
|---|---|---|
BEGIN TRANSACTION | β’ Starts an explicit transaction block β subsequent changes are held until COMMIT or ROLLBACK β’ BEGIN / START TRANSACTION in PostgreSQL and MySQL; most engines default to autocommit (each statement is its own transaction) unless wrapped. | |
COMMIT | β’ Permanently saves all changes made since the transaction started and frees the transaction's locks and resources β’ cannot be undone β once committed, ROLLBACK no longer applies. | |
ROLLBACK | β’ Discards all changes made since BEGIN (or since a named SAVEPOINT, with ROLLBACK TO)β’ used for error recovery and to abort transactions cleanly. | |
SAVEPOINT before_updateROLLBACK TO before_update | β’ Creates a named checkpoint inside a transaction β’ ROLLBACK TO savepoint undoes work back to that point without ending the transaction β the outer transaction stays open and can still COMMIT. | |
SET TRANSACTION ISOLATION LEVEL READ COMMITTED | β’ Controls which concurrency anomalies are allowed β dirty reads, non-repeatable reads, phantom reads β’ Levels (ascending strictness): READ UNCOMMITTED β READ COMMITTED β REPEATABLE READ β SERIALIZABLE; SQL Server adds SNAPSHOT. Defaults vary: READ COMMITTED in SQL Server / PostgreSQL / Oracle, REPEATABLE READ in MySQL InnoDB. | |
SELECT * FROM orders WITH (NOLOCK) | β’ SQL Server table hint equivalent to READ UNCOMMITTED for that table β reads without acquiring shared locks β’ can return uncommitted (dirty) data, duplicate rows, or miss previously-committed rows; a notorious anti-pattern when used as a "go-fast" switch. |
Table 24: Error Handling
SQL Server's structured error handling centers on TRY...CATCH: any runtime error of severity above 10 inside the TRY block jumps execution to the matching CATCH block, where a family of ERROR_* functions exposes the message, number, severity, line, and procedure of the original error. Inside CATCH you also check XACT_STATE() to decide whether to commit or roll back, and re-surface errors with THROW (preferred since SQL Server 2012) or the older RAISERROR.
| Construct | Example | Description |
|---|---|---|
BEGIN TRY INSERT INTO orders VALUES (1, 99.99)END TRYBEGIN CATCH SELECT ERROR_MESSAGE(), ERROR_NUMBER()END CATCH | β’ Structured error handling (SQL Server / Azure SQL) β’ any runtime error of severity > 10 in TRY transfers control to CATCH; compile and name-resolution errors at the same scope are NOT caught. | |
SELECT ERROR_MESSAGE() | β’ Returns the error message text of the error that triggered the CATCH block β’ returns NULL outside CATCH | |
SELECT ERROR_NUMBER() | β’ Returns the error number β correlates to sys.messages for system errorsβ’ user-defined errors use 50000+ β’ unlike @@ERROR, the value persists across the whole CATCH scope. | |
IF XACT_STATE() = -1 ROLLBACKIF XACT_STATE() = 1 COMMIT | β’ Returns 1 (active committable), -1 (active uncommittable β must rollback), or 0 (no active transaction) β’ essential check in CATCH blocks before deciding to COMMIT or ROLLBACK. | |
THROW 50001, 'Invalid customer ID', 1-- or to re-throw:THROW | β’ Raises a user-defined error (number β₯ 50000, severity always 16) or re-throws the original error with its original details (bare THROW in CATCH)β’ SQL Server 2012+; preferred over RAISERROR for new code; honors SET XACT_ABORT. | |
RAISERROR('Custom error: %s', 16, 1, @detail) | β’ Older mechanism for raising errors; supports custom severity 1-25 and printf-style message formatting ( %s, %d)β’ does not honor SET XACT_ABORT; still useful when you need a custom severity. |
Table 25: JSON Functions
SQL Server stores JSON as plain nvarchar (or the native json type in 2025+) and exposes a family of functions to extract, validate, build, and shred it. Each function pairs with a specific shape β scalars use JSON_VALUE, objects and arrays use JSON_QUERY, and rowset shredding uses OPENJSON β and most accept a lax (silent NULL) or strict (raise error) path mode.
| Function | Example | Description |
|---|---|---|
JSON_VALUE(payload, '$.customer.name') | β’ Extracts a scalar value (string, number, boolean) from a JSON path β’ returns NULL in lax mode (default) if the path is missing or points to an object/array. | |
JSON_QUERY(payload, '$.address') | β’ Extracts a JSON object or array fragment β’ returns NULL for scalar values in lax mode (the opposite of JSON_VALUE). | |
JSON_OBJECT('id':id, 'name':name) | β’ Constructs a JSON object from key:value pairs (SQL Server 2022+) β’ default NULL ON NULL keeps NULL keys; ABSENT ON NULL omits them. | |
JSON_ARRAY(1, 'two', NULL) | β’ Constructs a JSON array from values (SQL Server 2022+) β’ default ABSENT ON NULL silently drops NULL elements (opposite of JSON_OBJECT default). | |
WHERE ISJSON(payload) = 1 | β’ Returns 1 if the string is valid JSON, 0 if not, NULL if input is NULL β’ 2022+ accepts a type constraint: VALUE, ARRAY, OBJECT, SCALAR. | |
JSON_MODIFY(payload, '$.status', 'active') | β’ Returns a new JSON string with one property updated β’ setting NULL in lax mode deletes the key β’ use the append keyword to push onto an array. | |
SELECT * FROM OPENJSON(@json) WITH (id INT, name VARCHAR(50)) | β’ Parses JSON into a rowset (SQL Server 2016+) β’ without WITH returns generic key/value/type columns; WITH defines typed columns and supports AS JSON for nested fragments. |
Table 26: Query Execution Order
The order in which you write SQL clauses is not the order the database evaluates them. Understanding this logical processing order is the single biggest unlock for "why doesn't this work?" errors β it explains why WHERE cannot see SELECT aliases, why GROUP BY rejects non-aggregated columns, and why window functions cannot be filtered in WHERE.
| Step | Example | Description |
|---|---|---|
FROM orders o JOIN customers c ON o.customer_id = c.id | β’ First step β identifies source tables and performs joins β’ table aliases defined here are available to every later clause. | |
WHERE o.status = 'shipped' | β’ Filters individual rows before any grouping β’ cannot reference SELECT aliases, aggregate functions, or window functions. | |
GROUP BY c.country | Collapses rows into groups for aggregate computation β every non-aggregated column in SELECT must appear here. | |
HAVING COUNT(o.id) > 5 | β’ Filters groups after aggregation β’ can reference aggregate functions; cannot reference SELECT aliases in standard SQL. | |
SELECT c.country, COUNT(o.id), SUM(o.amount) | Evaluates expressions, aliases, and window functions β aliases defined here are why WHERE/GROUP BY/HAVING can't see them. | |
QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) = 1 | β’ Filters on window function results β what HAVING is to GROUP BY, QUALIFY is to window functions β’ Snowflake, BigQuery, DuckDB, Databricks; not ANSI standard. | |
SELECT DISTINCT country | Removes duplicate rows from the result set after SELECT expressions are evaluated. | |
ORDER BY SUM(o.amount) DESC | β’ Sorts the final result set β runs after SELECT, so it CAN reference SELECT aliases β’ without it, row order is not guaranteed. | |
LIMIT 10 OFFSET 20 | β’ Restricts and paginates the already-sorted result β’ without ORDER BY the rows returned are unpredictable. |
Table 27: Advanced Ordering
Beyond a single ORDER BY column ASC, real queries often need per-column sort directions, explicit NULL placement, custom priority orders, locale-aware comparison, or row counts that include ties. The exact syntax and defaults vary across SQL dialects, so portable code typically spells out direction, NULL handling, and tie-handling explicitly rather than relying on the database default.
| Technique | Example | Description |
|---|---|---|
ORDER BY salary DESC, name ASC | β’ Multi-column sort β secondary columns resolve ties from the primary column β’ ASC is the default; direction is per-column, so ORDER BY a, b DESC sorts a ASC then b DESC. | |
ORDER BY score DESC NULLS LAST | β’ Explicit NULL positioning in the sort (ANSI standard) β’ defaults differ β PostgreSQL/Oracle treat NULL as larger (NULLS LAST for ASC); SQL Server, MySQL, SQLite treat NULL as smaller β’ SQL Server does not support the keyword β emulate with CASE WHEN col IS NULL THEN 1 ELSE 0 END. | |
ORDER BY 1, 2 | β’ Reference by SELECT-list position β 1 = first selected column β’ convenient for ad-hoc queries but fragile in production: reordering the SELECT list silently changes the sort. | |
ORDER BY CASE status WHEN 'urgent' THEN 1 WHEN 'normal' THEN 2 ELSE 3 END | β’ Custom sort priority using a CASE expression as the sort key β’ enables business-rule ordering that doesn't follow alphabetical or numeric order. | |
ORDER BY name COLLATE Latin1_General_CI_AI | β’ Overrides the column's collation for that one sort β CI = case-insensitive, AI = accent-insensitive β’ useful when the column's default collation doesn't match the sort you need (e.g., locale-aware alphabetisation). | |
SELECT TOP 3 WITH TIES name, score FROM leaderboard ORDER BY score DESC | β’ Includes rows tied with the last qualifying row's ORDER BY value, so the result may exceed N β’ requires ORDER BY; SQL Server syntax β PostgreSQL uses the SQL-standard FETCH FIRST n ROWS WITH TIES. |