© 2026 CheatGrid™. All rights reserved.

Weaviate Vector Database Cheat Sheet

Updated 2026-05-15

Weaviate is an open-source, AI-native vector database written in Go that stores objects and their vectors together, enabling hybrid search (vector + keyword) at scale. Positioned as an AI database rather than just a vector store, it integrates with embedding models (OpenAI, Cohere, Hugging Face) and generative modules for RAG, and offers automatic vectorization and native multi-tenancy. Unlike pure vector stores, it maintains an inverted index alongside its HNSW indexes as a core architectural component, and supports GraphQL queries with nested cross-references and aggregations. The key mental model: a collection is a schema-defined set of objects, each object can carry multiple named vectors for multi-modal search, and each tenant is physically isolated in a dedicated shard. That combination makes Weaviate particularly strong for SaaS deployments that need data isolation, hybrid retrieval, and production-grade filtering at billion-object scale.

What This Cheat Sheet Covers

This topic spans 34 focused tables and 138 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Collection Schema Configuration
Table 2: Property Data Types
Table 3: Vector Index Types
Table 4: Vector Quantization Methods
Table 5: Distance Metrics
Table 6: Vector Search Operators
Table 7: Hybrid Search Configuration
Table 8: BM25 Keyword Search
Table 9: Filtering with Where Clause
Table 10: GraphQL Query Types
Table 11: GraphQL Metadata Fields
Table 12: Cross-References
Table 13: Named Vectors (Multi-Vector Embeddings)
Table 14: Generative Search (RAG)
Table 15: Reranking
Table 16: Multi-Tenancy
Table 17: Batch Operations
Table 18: REST API Endpoints
Table 19: Authentication Methods
Table 20: Replication and Consistency
Table 21: Deployment Options
Table 22: Backup and Restore
Table 23: Storage Tiers
Table 24: Advanced Query Features
Table 25: Object TTL (Time-To-Live)
Table 26: Model Context Protocol (MCP)
Table 27: Diversity Search (MMR)
Table 28: Python Client
Table 29: Collection Management
Table 30: Query Profiling and Debugging
Table 31: Tokenization and Text Analysis
Table 32: Ref2Vec (Recommendation Vectorizer)
Table 33: gRPC API
Table 34: Sharding and Clustering

Table 1: Collection Schema Configuration

A collection is Weaviate's answer to a table, and almost every important decision about how it behaves is baked in at creation time. These are the top-level knobs you set when you define one: which embedding model auto-vectorizes your data, which index and distance metric to use, how data is replicated and sharded for scale, whether tenants are isolated, and which generative and reranker modules ride along for RAG. Many of these are immutable afterward, so it pays to understand them before you create the collection.

Each entry below lists the property, an example, and a description. The examples use the Weaviate Python client and assume that wvc is the alias from "import weaviate.classes as wvc" and that client is an already connected client object.

Property: vectorizer
Example:
    client.collections.create(
        name="Article",
        vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai()
    )
Description:
• Specifies the embedding module used to auto-vectorize data
• Options include text2vec-openai, text2vec-cohere, text2vec-transformers, multi2vec-clip, or none for custom vectors

Property: vector index type
Example:
    vector_index_config=wvc.config.Configure.VectorIndex.hnsw(
        distance_metric=wvc.config.VectorDistances.COSINE
    )
Description:
• Sets the index structure: hnsw (default, fast ANN), flat (brute-force), dynamic (starts flat, converts to HNSW), or hfresh (high-churn streaming data)

Property: replication factor
Example:
    replication_config=wvc.config.Configure.replication(
        factor=3
    )
Description:
• Number of data copies stored across cluster nodes for high availability
• Directly multiplies storage cost and improves fault tolerance

Property: sharding config
Example:
    sharding_config=wvc.config.Configure.sharding(
        virtual_per_physical=128,
        desired_count=2
    )
Description:
• Controls horizontal partitioning
• desired_count sets the target number of shards; virtual_per_physical enables flexible rebalancing across nodes
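
The replication and sharding numbers in the table can be turned into a quick back-of-the-envelope estimate. The sketch below is illustrative only: the helper names and the 50 GB dataset size are hypothetical, the storage estimate ignores index and metadata overhead, and the virtual shard count assumes the default behavior of deriving it as desired_count multiplied by virtual_per_physical.

```python
def storage_footprint_gb(raw_gb: float, replication_factor: int) -> float:
    """Total stored data when every object is kept `replication_factor` times.

    Hypothetical helper: ignores vector index and metadata overhead.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return raw_gb * replication_factor


def virtual_shard_count(desired_count: int, virtual_per_physical: int) -> int:
    """Virtual shards spread over the physical shards.

    Assumes the default of desired_count * virtual_per_physical.
    """
    if desired_count < 1 or virtual_per_physical < 1:
        raise ValueError("both arguments must be at least 1")
    return desired_count * virtual_per_physical


if __name__ == "__main__":
    # 50 GB is a made-up dataset size used only for illustration.
    print("Stored data (GB):", storage_footprint_gb(50.0, replication_factor=3))
    # Values taken from the sharding example in the table above.
    print("Virtual shards:", virtual_shard_count(desired_count=2, virtual_per_physical=128))
```

With factor=3, the same 50 GB of objects occupies roughly three times the space across the cluster, which is the storage cost the table refers to. The many small virtual shards are what allow data to be moved between nodes in finer-grained pieces than whole physical shards.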
