dbt (Data Build Tool) Cheat Sheet

Updated 2026-05-28


dbt (data build tool) is an open-source analytics engineering framework that transforms raw data already loaded into a warehouse, using SQL-first workflows and software engineering practices. Teams use it to build modular, tested, and documented data pipelines directly in platforms like Snowflake, BigQuery, Databricks, and Redshift, treating transformations as code with version control, CI/CD, and automated testing. It combines SQL with Jinja templating, which allows dynamic, reusable logic while tracking lineage from raw sources through final models to BI exposures. One key mental model: dbt doesn't move data. It compiles your models into SELECT statements, wraps them in the DDL/DML needed to build tables and views, and your warehouse executes the result. That makes dbt the transformation step of the modern ELT (Extract-Load-Transform) paradigm.
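
To make that mental model concrete, here is a minimal sketch of a model file, with comments showing roughly what the warehouse receives. The model name `customer_order_totals`, the upstream model `stg_orders`, its columns, and the `analytics.dbt_dev` database and schema are all hypothetical. The exact DDL wrapper depends on the adapter and the materialization.

```sql
-- models/customer_order_totals.sql (hypothetical model name)
-- The ref() call below resolves to a fully qualified relation name and
-- registers stg_orders as an upstream dependency for lineage.
{{ config(materialized='view') }}

select
    customer_id,
    count(*) as order_count,
    sum(order_total) as lifetime_value
from {{ ref('stg_orders') }}
group by customer_id

-- Compiled SQL (roughly), assuming stg_orders was built in analytics.dbt_dev:
--   select customer_id, count(*) as order_count, sum(order_total) as lifetime_value
--   from analytics.dbt_dev.stg_orders
--   group by customer_id
--
-- For a view materialization, dbt wraps that SELECT in DDL along the lines of:
--   create or replace view analytics.dbt_dev.customer_order_totals as ( ... );
```

Running `dbt compile` writes the rendered SQL under `target/compiled/`, which is a quick way to inspect what the warehouse will actually run.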

What This Cheat Sheet Covers

This cheat sheet spans 18 tables and 167 indexed concepts, from foundational concepts through advanced details. The table-by-table outline follows.

Table 1: Project Structure and Model Layers
Table 2: Materializations
Table 3: Incremental Strategies
Table 4: Built-in Tests
Table 5: CLI Commands
Table 6: Jinja and Macros
Table 7: Snapshots and SCD Type 2
Table 8: Model Configurations
Table 9: Node Selection Syntax
Table 10: Sources and Freshness
Table 11: Packages and Dependencies
Table 12: Environments and Deployment
Table 13: Advanced Features
Table 14: Testing and Data Quality
Table 15: Performance and Optimization
Table 16: Debugging and Development
Table 17: dbt Core vs dbt Cloud
Table 18: Adapters and Platform Support

Table 1: Project Structure and Model Layers

A well-structured dbt project follows a layered approach (staging, intermediate, and mart) in which each layer has a clear purpose and a corresponding naming convention. The layer a model belongs to guides its materialization strategy, access level, and intended audience.

Component	Example	Description
dbt_project.yml	`name: analytics` `profile: analytics` `models:` `analytics:` `+materialized: table`	• Root configuration file defining the project name, the connection profile to use (by name, matching a top-level key in `profiles.yml`), version, model paths, and default configs • required in every dbt project
profiles.yml	`analytics:` `target: dev` `outputs:` `dev:` `type: snowflake` `database: DEV_DB`	• Connection profiles and credentials, stored in `~/.dbt/` by default • defines how dbt connects to the warehouse, specifying database, schema, and authentication for each target • `target:` selects which output is used by default
Staging models	`SELECT *` `FROM {{ source('erp', 'orders') }}` `WHERE _fivetran_deleted = FALSE`	• First transformation layer that cleans, renames, and casts raw source data • one staging model per source table; prefix with `stg_`.
Intermediate models	`SELECT` `order_id,` `SUM(line_total) AS order_total` `FROM {{ ref('stg_order_lines') }}` `GROUP BY 1`	• Purpose-built models that break complex logic into modular steps • not exposed to end users; often materialized as ephemeral or as views; prefix with `int_`
Mart models	`SELECT *` `FROM {{ ref('int_orders_joined') }}` `WHERE is_valid = TRUE`	• Business-conformed models ready for BI tools • organized by domain (finance, marketing) • prefix with `fct_` (facts) or `dim_` (dimensions).
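models/ directory	`models/` `staging/` `intermediate/` `marts/` `finance/`	• Folder containing all `.sql` model files and their `.yml` property files • folders do not set the warehouse schema by themselves (models build in the target schema from `profiles.yml` by default), but folder paths are where folder-level configs such as `+schema` and `+materialized` attach in `dbt_project.yml`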
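
The staging row in the table above says staging models clean, rename, and cast raw data, which the short `SELECT *` example does not show. A slightly fuller sketch follows. The file name and the raw column names (`id`, `cust_id`, `order_ts`, `amount`) are assumptions for illustration, and the type names follow Snowflake to match the `profiles.yml` example; adjust them for your warehouse.

```sql
-- models/staging/stg_orders.sql (hypothetical file and column names)
with source as (

    -- Read from the declared raw source rather than a hard-coded table name
    select * from {{ source('erp', 'orders') }}

),

renamed as (

    select
        -- Rename raw columns to consistent, descriptive names
        id as order_id,
        cust_id as customer_id,
        -- Cast to explicit types so downstream models can rely on them
        cast(order_ts as timestamp) as ordered_at,
        cast(amount as numeric(18, 2)) as order_amount
    from source
    -- Drop rows the loader has flagged as soft-deleted
    where _fivetran_deleted = false

)

select * from renamed
```

Intermediate and mart models would then select from this model with `{{ ref('stg_orders') }}` instead of touching the raw source directly, which keeps source-specific cleanup in one place.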
