Apache Cassandra Cheat Sheet

Updated 2026-05-15

Apache Cassandra is a distributed NoSQL database built to handle massive volumes of data across commodity servers while providing high availability with no single point of failure. Originally developed at Facebook and open-sourced in 2008, it uses a peer-to-peer architecture with eventual consistency, allowing linear scalability and fault tolerance across multiple datacenters. Cassandra's write-optimized storage engine is based on an LSM tree: each write is appended to the commit log for durability and applied to an in-memory memtable, which is later flushed to immutable SSTables on disk. This makes writes extremely fast, while tunable consistency lets you balance latency against data correctness per query. The key mental model: Cassandra trades strong consistency guarantees for availability and partition tolerance, so data modeling is query-first. Denormalize aggressively, design tables around access patterns, and accept that joins are not supported and aggregations across partitions are expensive.

What This Cheat Sheet Covers

This cheat sheet spans 20 focused tables and 126 indexed concepts. The outline below lists every table, from foundational concepts through advanced details.

Table 1: Keyspace Management
Table 2: Table Creation and Primary Keys
Table 3: CQL Data Types
Table 4: User-Defined Types
Table 5: INSERT, UPDATE, and DELETE Operations
Table 6: SELECT Queries and Filtering
Table 7: Consistency Levels
Table 8: Lightweight Transactions
Table 9: Indexes and Materialized Views
Table 10: Batch Statements
Table 11: TTL and Data Expiry
Table 12: Data Modeling Patterns
Table 13: Compaction Strategies
Table 14: Nodetool Commands
Table 15: Storage Engine Components
Table 16: Cluster Architecture
Table 17: Write Path
Table 18: Read Path
Table 19: Connection and Driver Configuration
Table 20: Advanced Features and Patterns

Table 1: Keyspace Management

A keyspace is the outermost container in Cassandra — the equivalent of a database in the relational world — and its single most important setting is how data gets replicated. These commands cover creating, altering, and dropping keyspaces, with the replication strategy and per-datacenter replication factor that decide how many copies of your data live where. Reach for NetworkTopologyStrategy in any real deployment; SimpleStrategy is strictly a development convenience.

Command	Example	Description
CREATE KEYSPACE	`CREATE KEYSPACE cycling WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 2};`	• Creates a top-level namespace and defines its replication strategy and the replication factor per datacenter • NetworkTopologyStrategy is the production standard
SimpleStrategy	`CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};`	• Single-datacenter replication strategy that places replicas on consecutive nodes clockwise around the ring • Not recommended for production; use only for development or single-DC testing
ALTER KEYSPACE	`ALTER KEYSPACE cycling WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'dc1': 4};`	• Modifies keyspace properties such as the replication factor • The new settings take effect immediately, but existing data reaches the new replicas only after you run nodetool repair
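
The table above has no row for dropping a keyspace or for checking which replication settings are currently in effect. The following is a minimal sketch of that workflow, not a definitive procedure. It assumes the `cycling` and `test` keyspaces from the examples above and a cluster whose datacenters are really named `dc1` and `dc2`; the datacenter names must match what your snitch reports.

```cql
-- Inspect the replication settings currently stored for a keyspace.
SELECT keyspace_name, replication
FROM system_schema.keyspaces
WHERE keyspace_name = 'cycling';

-- Raise the replication factor in dc2 from 2 to 3, keeping dc1 at 3.
ALTER KEYSPACE cycling
WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 3,
  'dc2': 3
};

-- Existing data is not copied to the new replicas automatically.
-- Afterwards, run a full repair from a shell on each affected node
-- (this is a nodetool command, not a CQL statement):
--   nodetool repair --full cycling

-- Remove the development keyspace together with its tables and data.
DROP KEYSPACE IF EXISTS test;
```

Listing every datacenter explicitly in the `ALTER KEYSPACE` map, as done here, makes the intended replication factor for each one unambiguous.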
