Ibis is a portable Python dataframe library that provides a backend-agnostic API for data manipulation and analytics across 22+ execution engines, including DuckDB, BigQuery, PostgreSQL, Snowflake, Spark, Polars, ClickHouse, and Materialize. Unlike pandas, which operates in memory, Ibis uses lazy evaluation: it compiles Python expressions into SQL or engine-native code that the backend then optimizes and runs, so large datasets can be analyzed without first being loaded into local memory. The key insight: write transformation logic once in a pandas-like API, then execute it on any supported backend, from local DuckDB for prototyping to cloud data warehouses for production, without rewriting code. As of Ibis 12.0 (February 2026), the library supports Python 3.10+ and has expanded its write-back capabilities with upsert(), per-file output methods, and Delta Lake support.
What This Cheat Sheet Covers
This topic spans 19 focused tables and 218 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Backend Connection and Configuration
Ibis uses a thin connection object to talk to any supported backend, so swapping backends usually means changing only the line that creates the connection. DuckDB is the default for local work, and the universal ibis.connect() function infers the backend from a URI string, which makes backend migration straightforward.
| Method | Example | Description |
|---|---|---|
| `ibis.duckdb.connect()` | `con = ibis.duckdb.connect()`<br>`con = ibis.duckdb.connect("db.duckdb")` | Connect to in-memory DuckDB or a persistent file. Default backend for local development. |
| `ibis.connect()` | `con = ibis.connect("duckdb://mydb.db")`<br>`con = ibis.connect("postgres://localhost/db")` | Universal connection that infers the backend from a URI; simplifies multi-backend workflows. |
| `ibis.memtable()` | `t = ibis.memtable({"a": [1, 2], "b": [3, 4]})`<br>`t = ibis.memtable(pandas_df)` | Create an in-memory table from a dict, pandas DataFrame, or Polars DataFrame using the default backend. |
| `con.table()` | `t = con.table("customers")` | Reference an existing backend table; returns a lazy table expression without loading data. |
| `con.list_tables()` | `tables = con.list_tables()`<br>`tables = con.list_tables(database="db")` | List available tables; optionally filter by database or schema. |
| `con.read_csv()` | `t = con.read_csv("data.csv")`<br>`t = con.read_csv("data.csv", types={"id": "int64"})` | Register a CSV file as a table with automatic type inference; supports type overrides. |
| `con.read_parquet()` | `t = con.read_parquet("data.parquet")`<br>`t = con.read_parquet("*.parquet")` | Register a Parquet file or glob pattern as a table; supports predicate pushdown. |