Airbyte is an open-source data integration platform that enables ELT (Extract, Load, Transform) workflows through a modular connector architecture. It provides 600+ pre-built connectors for APIs, databases, and data warehouses, alongside a Python CDK and a low-code Connector Builder for custom integrations. Airbyte distinguishes itself through transparent state management, flexible deployment options (Cloud vs. self-hosted OSS/Enterprise), and first-class integration with dbt and orchestration tools in the modern data stack. One key insight: Airbyte's sync modes (full refresh vs. incremental, append vs. deduped) determine how data flows into and persists in the destination, making sync mode selection the most consequential configuration decision after connector setup.
What This Cheat Sheet Covers
This topic spans 20 focused tables and 112 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Sync Modes
| Mode | Example | Description |
|---|---|---|
| Full Refresh - Overwrite | `SELECT * FROM table` → Replace all destination data | Retrieves all available records from the source and replaces everything in the destination; ideal for small tables or complete rebuilds |
| Full Refresh - Append | All records synced → Appended to existing data | Syncs all source records but appends rather than overwrites; creates duplicates if run multiple times |
| Full Refresh - Overwrite + Deduped | All records + primary key → Deduplicated destination | Combines full refresh with deduplication on the primary key; when several records share a key, the most recent one (by cursor field) wins |
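
The three full-refresh behaviours above can be illustrated with a small in-memory simulation. This is only a sketch of the semantics described in the table, not Airbyte's implementation or API: the function names, the `Record` alias, and the sample fields (`id`, `name`, `updated_at`) are all hypothetical, and the deduplication assumes the cursor field values compare correctly with `>=`.

```python
from typing import Any

Record = dict[str, Any]  # hypothetical alias for one synced row


def full_refresh_overwrite(source: list[Record], destination: list[Record]) -> list[Record]:
    """Discard whatever is in the destination and keep only the source records."""
    return list(source)


def full_refresh_append(source: list[Record], destination: list[Record]) -> list[Record]:
    """Keep existing destination rows and add every source record again."""
    return destination + list(source)


def full_refresh_overwrite_deduped(
    source: list[Record], primary_key: str, cursor_field: str
) -> list[Record]:
    """Replace the destination, keeping one row per primary key.

    When several records share a key, the one with the greatest
    cursor value is kept (assumption: cursor values are comparable).
    """
    latest: dict[Any, Record] = {}
    for record in source:
        key = record[primary_key]
        current = latest.get(key)
        if current is None or record[cursor_field] >= current[cursor_field]:
            latest[key] = record
    return list(latest.values())


# Illustrative data only: id 1 appears twice with different cursor values.
source = [
    {"id": 1, "name": "Ada", "updated_at": "2024-01-01"},
    {"id": 2, "name": "Grace", "updated_at": "2024-01-02"},
    {"id": 1, "name": "Ada L.", "updated_at": "2024-01-03"},
]
destination = [{"id": 9, "name": "stale", "updated_at": "2023-12-31"}]

overwritten = full_refresh_overwrite(source, destination)
appended_once = full_refresh_append(source, destination)
appended_twice = full_refresh_append(source, appended_once)  # second run duplicates rows
deduped = full_refresh_overwrite_deduped(source, "id", "updated_at")

print(len(overwritten), len(appended_once), len(appended_twice), len(deduped))
```

Running the append variant twice shows why that mode accumulates duplicates, while the deduped variant collapses the two `id == 1` records into the one with the later `updated_at`.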