© 2026 CheatGrid™. All rights reserved.

Chroma Vector Database Cheat Sheet


Updated 2026-05-21

Chroma is an open-source, AI-native vector database designed for storing, searching, and managing embeddings alongside their metadata and documents, with rich filtering over both. It sits at the heart of retrieval-augmented generation (RAG) pipelines, giving LLMs long-term memory and semantic search over private data. Chroma runs in-memory for rapid prototyping, on disk for local persistence, or as a remote HTTP server or the fully managed Chroma Cloud for production scale, all under the same Python and JavaScript API. The key mental model is that a collection acts like a smart table: every record holds an ID, an embedding vector, optional metadata key-value pairs, and an optional document string, and everything from insert to nearest-neighbor search runs through that single consistent shape.

What This Cheat Sheet Covers

This topic spans 17 focused tables and 126 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

  • Table 1: Client Types and Initialization
  • Table 2: Collection Management
  • Table 3: Adding and Updating Data
  • Table 4: Querying Collections
  • Table 5: Metadata Filtering Operators (where)
  • Table 6: Document Filtering Operators (where_document)
  • Table 7: Embedding Functions
  • Table 8: HNSW Index Configuration
  • Table 9: Distance Functions
  • Table 10: Chroma Cloud and Managed Offering
  • Table 11: Multi-Tenancy
  • Table 12: Server Deployment and Authentication
  • Table 13: LangChain Integration
  • Table 14: LlamaIndex Integration
  • Table 15: Multimodal Collections
  • Table 16: chroma-mcp (MCP Server Integration)
  • Table 17: Gotchas, Pitfalls, and Best Practices

Table 1: Client Types and Initialization

Choosing the right Chroma client is the first decision in every project: it determines where data lives, whether it persists, and how many processes can share it. Each client type presents the identical collection API, so switching from development to production requires only changing the client constructor.

Type | Example | Description
--- | --- | ---
EphemeralClient | import chromadb; client = chromadb.EphemeralClient() | In-memory only client; data is lost when the process exits. Ideal for tests and rapid prototyping; no disk I/O overhead.
PersistentClient | client = chromadb.PersistentClient(path="./chroma_db") | Writes to disk at the given path; data survives restarts. Default choice for local development.
HttpClient | client = chromadb.HttpClient(host="localhost", port=8000) | Connects to a separately running Chroma server (HTTP); enables multi-process and multi-client access. Recommended for production self-hosting.
