Milvus is an open-source, cloud-native vector database built by Zilliz that stores, indexes, and searches high-dimensional embedding vectors at billion-scale. It powers AI applications (RAG pipelines, semantic search, recommendation systems, and multimodal retrieval) by turning similarity search into a first-class database operation. Unlike bolted-on vector extensions, Milvus separates compute from storage and uses a message-queue-backed write path (Pulsar/Kafka) so query nodes scale independently of data nodes. The key mental model: everything flows through collections → shards → segments, and choosing the right index type together with the right consistency level is what balances accuracy against throughput.
What This Cheat Sheet Covers
This topic spans 17 focused tables and 112 indexed concepts. Below is a table-by-table outline, from foundational concepts through advanced details.
Table 1: Core Data Model (Collections, Partitions, Shards, and Segments)
A Milvus collection is the top-level container for vectors and their associated scalar fields, analogous to a table in a relational database. Understanding how collections subdivide into partitions, shards, and segments is essential before tuning performance or planning capacity.
| Concept | Example | Description |
|---|---|---|
| Collection | `client.create_collection(collection_name="docs", dimension=768)` | Top-level data container holding a schema (scalar fields plus a vector field); up to 65,535 collections per instance. |
| Shard | `num_shards=2` in `create_collection()` | Horizontal unit for write scaling; the primary key is hashed to route inserts across shards; 1 to 2 shards per 50 to 200M entities is the recommended range; immutable after creation. |
| Partition | `client.create_partition(collection_name="docs", partition_name="2024")` | Logical read-time subdivision within a collection; queries can skip irrelevant partitions entirely to reduce the search footprint; up to 1,024 partitions per collection. |
| Partition key | `schema.add_field("tenant_id", DataType.VARCHAR, is_partition_key=True)` | Designates a scalar field so Milvus auto-manages partitions by hashing field values; ideal for multi-tenancy with millions of tenants; eliminates manual partition management. |
| Segment | (internal, not user-created) | Smallest execution unit; the intersection of a shard and a partition; either Growing (buffering writes, unindexed) or Sealed (immutable, indexed); default max size ~122 MB before sealing. |
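
The routing described in the table can be pictured with a small, library-free sketch. It is a toy model, not Milvus code: the CRC32 hash, the `NUM_PARTITIONS` value, and the helper names (`stable_hash`, `shard_for`, `partition_for`, `route`) are all assumptions made for illustration, and Milvus uses its own internal hashing. It only shows how hashing the primary key picks a shard, how hashing a partition-key value picks a partition, and how each (shard, partition) pair maps to its own segment bucket.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 2       # mirrors num_shards=2; fixed once the collection exists
NUM_PARTITIONS = 4   # hypothetical bucket count for the partition key


def stable_hash(value) -> int:
    """Deterministic stand-in hash (CRC32); NOT the hash Milvus actually uses."""
    return zlib.crc32(str(value).encode("utf-8"))


def shard_for(primary_key) -> int:
    """Pick the shard that receives an insert, based on the primary key."""
    return stable_hash(primary_key) % NUM_SHARDS


def partition_for(tenant_id: str) -> int:
    """Pick the partition bucket, based on the partition-key field value."""
    return stable_hash(tenant_id) % NUM_PARTITIONS


def route(rows):
    """Group row ids by (shard, partition); each group models one growing segment."""
    segments = defaultdict(list)
    for row in rows:
        key = (shard_for(row["id"]), partition_for(row["tenant_id"]))
        segments[key].append(row["id"])
    return dict(segments)


# Twelve toy rows spread across three tenants.
rows = [{"id": i, "tenant_id": f"tenant-{i % 3}"} for i in range(12)]
segments = route(rows)
for (shard, partition), ids in sorted(segments.items()):
    print(f"shard={shard} partition={partition} ids={ids}")
```

In this model all rows for one tenant land in the same partition bucket, which is why a search filtered on the partition key can skip the other buckets. A real deployment would additionally seal each growing segment once it reaches the size threshold, so one (shard, partition) pair typically accumulates several segments over time.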