Ibis is a portable Python dataframe library that provides a backend-agnostic API for data manipulation and analytics across 20+ execution engines, including DuckDB, BigQuery, PostgreSQL, Snowflake, Spark, and Polars. Unlike pandas, which operates in memory, Ibis uses lazy evaluation to compile Python expressions into SQL or engine-native code, so analytics can scale to large datasets without first loading the data into local memory. The key insight: write transformation logic once in a pandas-like API, then execute it on any supported backend, from local DuckDB for prototyping to cloud data warehouses for production, without rewriting code.
What This Cheat Sheet Covers
This topic spans 17 focused tables and 140 indexed concepts. The tables below run from foundational concepts through advanced details.
Table 1: Backend Connection and Configuration
| Method | Example | Description |
|---|---|---|
| `ibis.duckdb.connect` | `con = ibis.duckdb.connect()`<br>`con = ibis.duckdb.connect("db.duckdb")` | Connect to an in-memory DuckDB database or a persistent database file. DuckDB is the default backend for local development. |
| `ibis.bigquery.connect` | `con = ibis.bigquery.connect(project_id="my-project")` | Connect to Google BigQuery. Takes a `project_id`; authentication is via a service account or user credentials. |
| `ibis.postgres.connect` | `con = ibis.postgres.connect(host="localhost", database="mydb")` | Connect to a PostgreSQL database. Supports the standard connection parameters: host, port, user, password, and database. |
| `ibis.snowflake.connect` | `con = ibis.snowflake.connect(account="xy12345", user="name")` | Connect to a Snowflake data warehouse using an account identifier and authentication. Supports SSO and key-pair authentication. |
| `ibis.connect` | `con = ibis.connect("duckdb://mydb.db")`<br>`con = ibis.connect("postgres://localhost/db")` | Universal connection method that infers the backend from the connection string URI. Simplifies multi-backend workflows. |