LlamaIndex Cheat Sheet

Updated 2026-04-28


LlamaIndex is an open-source data orchestration framework for building production-ready LLM applications, specializing in retrieval-augmented generation (RAG) and agentic document workflows. It connects private or domain-specific data to large language models through indexing, retrieval, and query mechanisms. Unlike general-purpose orchestration frameworks, LlamaIndex prioritizes data ingestion pipelines, advanced retrieval strategies, and context engineering, which makes it a strong fit when your application's success hinges on how well you retrieve and structure information before passing it to an LLM. The framework treats documents as first-class citizens, offering deep control over chunking, embedding, metadata extraction, hierarchical relationships, and multi-step retrieval patterns, along with agentic workflows and MCP integration, all of which matter for knowledge-intensive production applications.

What This Cheat Sheet Covers

This cheat sheet spans 29 focused tables and 195 indexed concepts. The outline below lists every table, from foundational concepts through advanced details.

Table 1: Core Index Types
Table 2: Data Loaders and Connectors
Table 3: Ingestion Pipeline
Table 4: Node Parsers and Text Splitters
Table 5: LLM Integrations
Table 6: Embedding Models
Table 7: Query Engines
Table 8: Query Pipeline
Table 9: Retrievers
Table 10: Response Synthesizers
Table 11: Postprocessors and Rerankers
Table 12: Chat Engines
Table 13: Agents and Tools
Table 14: Workflows and Event-Driven Architecture
Table 15: Vector Stores Integration
Table 16: MultiModal RAG
Table 17: Storage and Persistence
Table 18: Query Transformations
Table 19: Metadata Extraction
Table 20: Evaluation Metrics
Table 21: Advanced Retrieval Strategies
Table 22: Settings and Configuration
Table 23: Prompt Customization
Table 24: Observability and Tracing
Table 25: Structured Output
Table 26: Streaming
Table 27: Document Management
Table 28: MCP Integration
Table 29: LlamaCloud Services

Table 1: Core Index Types

The index you choose decides how your data is structured and, in turn, how it gets retrieved: a vector index for semantic similarity, a tree or summary index for whole-document summarization, a property graph for relational reasoning. The choice has a large effect on retrieval quality, so it's worth knowing what each is good at before you commit a corpus to it.

Index	Example	Description
VectorStoreIndex	`index = VectorStoreIndex.from_documents(docs)`	• Stores vector embeddings of document chunks • retrieves via similarity search (cosine, Euclidean) • most common index for semantic retrieval in RAG.
SummaryIndex	`index = SummaryIndex.from_documents(docs)`	• Stores nodes as a sequential chain with no complex structure • retrieves all nodes or filters by keywords • formerly called ListIndex.
DocumentSummaryIndex	`index = DocumentSummaryIndex.from_documents(docs)`	• Extracts a summary per document and stores it alongside nodes • retrieves by matching query to document summaries first, then fetches relevant nodes.
PropertyGraphIndex	`index = PropertyGraphIndex.from_documents(docs)`	• Creates a knowledge graph with entities and relationships • supports Cypher queries (when the underlying graph store supports them), hybrid search, and graph-based retrieval for complex relational data.
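
To make the difference between the first two rows concrete, here is a minimal standard-library sketch of the two retrieval styles: ranking chunks by similarity versus returning every node, optionally filtered by keyword. It is a toy illustration, not LlamaIndex's implementation. The functions `embed`, `cosine`, `vector_retrieve`, and `summary_retrieve` are hypothetical names, and the bag-of-words "embedding" stands in for a real embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_retrieve(chunks, query, top_k=2):
    """Vector-index style: rank chunks by similarity to the query, keep top_k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:top_k]


def summary_retrieve(chunks, keywords=None):
    """Summary-index style: return all chunks in order, or only keyword matches."""
    if not keywords:
        return list(chunks)
    wanted = {k.lower() for k in keywords}
    return [c for c in chunks if wanted & set(c.lower().split())]


chunks = [
    "vector indexes rank chunks by embedding similarity",
    "summary indexes keep nodes in a simple sequence",
    "property graphs store entities and relationships",
]

top = vector_retrieve(chunks, "embedding similarity", top_k=1)
everything = summary_retrieve(chunks)
filtered = summary_retrieve(chunks, keywords=["nodes"])
```

In this sketch the vector-style call returns only the best-matching chunk, while the summary-style call returns everything unless a keyword filter narrows it. That mirrors the trade-off in the table: similarity search suits targeted questions, and sequential retrieval suits tasks that need the whole document.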
