Great Expectations Data Quality Cheat Sheet

Updated 2026-05-15

Great Expectations (GX) is an open-source Python framework for data quality. Teams use it to validate, document, and profile the data in their pipelines through declarative Expectations: assertions about data that can be tested automatically. It supports Pandas, Spark, and SQL backends and integrates with tools such as Airflow, dbt, and Databricks. Its distinguishing features are auto-generated Data Docs (human-readable validation reports), reusable Expectation Suites, and a Checkpoint-based execution model that runs validation and then triggers post-validation actions. A key mental model: Expectations are unit tests for data (specific, versioned, and executable), designed to catch quality issues before they propagate downstream.

What This Cheat Sheet Covers

This topic spans 12 focused tables and 89 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

  • Table 1: Expectation Categories
  • Table 2: Common Column-Level Expectations
  • Table 3: Statistical Column Expectations
  • Table 4: Table-Level Expectations
  • Table 5: Multi-Column Expectations
  • Table 6: Batch Request Patterns
  • Table 7: Checkpoint Configuration
  • Table 8: Action List Types
  • Table 9: Custom Expectation Base Classes
  • Table 10: Data Docs Configuration
  • Table 11: Integration Patterns
  • Table 12: Data Assistants and Profiling

Table 1: Expectation Categories

Every Expectation in GX belongs to one of these base categories, and the category determines how the assertion is evaluated: row-by-row, as a single aggregate, across the whole table, or against pairs and sets of columns. Knowing which family an Expectation belongs to tells you what kind of check it performs and whether the mostly threshold applies. Row-by-row (map) Expectations accept mostly, which sets the minimum fraction of rows that must pass; aggregate and table Expectations evaluate a single value and have no such threshold.
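To make the mostly arithmetic concrete, here is a minimal pure-Python sketch of a row-by-row check. It is not GX code: the helper name `column_map_passes` and the sample column are hypothetical, and the handling of an empty column is an assumption made for illustration.

```python
from typing import Any, Callable, Iterable, Tuple


def column_map_passes(
    values: Iterable[Any],
    condition: Callable[[Any], bool],
    mostly: float = 1.0,
) -> Tuple[bool, float]:
    """Hypothetical helper: apply `condition` to every row and compare
    the passing fraction with the `mostly` threshold."""
    rows = list(values)
    if not rows:
        # Assumption for this sketch: an empty column passes vacuously.
        return True, 1.0
    passing = sum(1 for value in rows if condition(value))
    fraction = passing / len(rows)
    return fraction >= mostly, fraction


def is_not_null(value: Any) -> bool:
    return value is not None


# 20 rows, exactly one of them null, so 95% of rows are non-null.
ages = [30] * 19 + [None]

strict_ok, fraction = column_map_passes(ages, is_not_null)            # mostly=1.0
lenient_ok, _ = column_map_passes(ages, is_not_null, mostly=0.95)

print(strict_ok, lenient_ok, fraction)
```

With the default threshold of 1.0 a single null row fails the check, while mostly=0.95 tolerates it. An aggregate check has no equivalent knob because it compares one computed value against its bounds.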

Type | Example | Description
Column Map Expectation | expect_column_values_to_not_be_null(column="age") | Evaluates a row-by-row condition on a single column; succeeds if the mostly threshold is met (e.g., 95% non-null).
Column Aggregate Expectation | expect_column_mean_to_be_between(column="price", min_value=10, max_value=100) | Computes a single aggregate metric (mean, std, distinct count) for a column; validates it against min/max bounds.
Table Expectation | expect_table_row_count_to_be_between(min_value=1000, max_value=50000) | Validates dataset-level properties such as row count, column presence, or column order; operates on the entire table.
Column Pair Map Expectation | expect_column_pair_values_a_to_be_greater_than_b(column_A="end_date", column_B="start_date") | Compares two columns row-by-row; checks relationships such as greater-than, equality, or set membership across paired values.
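The difference between the aggregate, table, and column-pair families is mostly a difference in what unit gets evaluated. The sketch below illustrates that in plain Python rather than with the GX API; the function names, the `orders` sample data, and the bounds are all hypothetical.

```python
from statistics import mean
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]

# Hypothetical sample table; the third row ends before it starts.
orders: List[Row] = [
    {"price": 20.0, "start_date": "2024-01-01", "end_date": "2024-01-05"},
    {"price": 55.0, "start_date": "2024-02-10", "end_date": "2024-02-11"},
    {"price": 90.0, "start_date": "2024-03-03", "end_date": "2024-03-01"},
]


def column_mean_between(rows: List[Row], column: str,
                        min_value: float, max_value: float) -> bool:
    """Aggregate-style check: one metric for the column, compared to bounds."""
    return min_value <= mean(row[column] for row in rows) <= max_value


def table_row_count_between(rows: List[Row],
                            min_value: int, max_value: int) -> bool:
    """Table-style check: a property of the whole dataset."""
    return min_value <= len(rows) <= max_value


def pair_a_greater_than_b(rows: List[Row], column_a: str,
                          column_b: str) -> Tuple[bool, List[int]]:
    """Pair-style check: compare two columns row by row and
    report the indexes of the rows that fail."""
    failures = [i for i, row in enumerate(rows)
                if not row[column_a] > row[column_b]]
    return not failures, failures


mean_ok = column_mean_between(orders, "price", 10, 100)
count_ok = table_row_count_between(orders, 1, 5)
# ISO-8601 date strings sort chronologically, so string comparison works here.
pair_ok, bad_rows = pair_a_greater_than_b(orders, "end_date", "start_date")

print(mean_ok, count_ok, pair_ok, bad_rows)
```

Only the pair check can point at individual failing rows; the aggregate and table checks return one pass/fail result for the whole column or dataset.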
