GraphRAG - Knowledge Graph Retrieval-Augmented Generation Cheat Sheet

Updated 2026-05-18

GraphRAG is a retrieval-augmented generation paradigm that combines knowledge graphs with large language models to address the limitations of standard vector-based RAG. Traditional RAG retrieves text chunks by semantic similarity. GraphRAG instead extracts a structured knowledge graph from the documents (entities, relationships, and communities), then uses graph traversal and community summaries to support both local (entity-focused) and global (dataset-wide) reasoning. This enables multi-hop inference, explainable provenance, and better accuracy on complex queries where the answer lives in the connections between facts rather than in any single passage.

GraphRAG has a two-stage architecture: an indexing pipeline that constructs the graph (entity extraction → relationship detection → community clustering → summary generation), and a retrieval pipeline that traverses or queries the graph at inference time. The trade-offs are higher indexing cost (10–100x the token usage of vanilla RAG) and increased latency. Where relational reasoning matters, as in finance, healthcare, and legal compliance, GraphRAG is reported to outperform embedding-only approaches by 35–46% on multi-hop benchmarks.
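To make the multi-hop idea concrete, here is a minimal sketch of the retrieval side, assuming the indexing stage has already produced triples. The triples, the `build_graph` and `multi_hop` helpers, and the hop limit are all hypothetical and chosen for illustration; a real system would typically query a graph database rather than an in-memory dictionary.

```python
from collections import defaultdict, deque

# Hypothetical triples, as an extraction step might emit them.
TRIPLES = [
    ("Alice", "WORKS_FOR", "TechCorp"),
    ("TechCorp", "INVESTED_IN", "DataCo"),
    ("DataCo", "LOCATED_IN", "Berlin"),
]


def build_graph(triples):
    """Index triples as an adjacency list: node -> [(relation, neighbor)]."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def multi_hop(graph, start, max_hops):
    """Breadth-first traversal returning every path of 1..max_hops edges."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue  # hop budget spent; this also bounds cycles
        for relation, neighbor in graph.get(node, []):
            new_path = path + [(node, relation, neighbor)]
            paths.append(new_path)
            queue.append((neighbor, new_path))
    return paths


graph = build_graph(TRIPLES)
paths = multi_hop(graph, "Alice", max_hops=2)
for path in paths:
    print(path)
```

With a two-hop budget, the traversal links Alice to DataCo through TechCorp, a connection that no single source sentence states. The path itself doubles as provenance for the answer.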

What This Cheat Sheet Covers

This topic spans 25 focused tables and 178 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.

Table 1: Core GraphRAG Concepts
Table 2: GraphRAG vs. Standard RAG
Table 3: Entity and Relationship Extraction Techniques
Table 4: Community Detection Algorithms
Table 5: Query Types and Retrieval Patterns
Table 6: Knowledge Graph Construction from Documents
Table 7: Graph Database Options
Table 8: Hybrid Vector-Graph Architectures
Table 9: Microsoft GraphRAG Implementation
Table 10: LlamaIndex GraphRAG Implementation
Table 11: LightRAG and Alternative Implementations
Table 12: Text Chunking Strategies for GraphRAG
Table 13: Prompting Techniques for GraphRAG
Table 14: Query Routing and Classification
Table 15: Cypher Query Generation and Text2Cypher
Table 16: Cost Optimization Strategies
Table 17: Performance and Scalability
Table 18: Monitoring and Observability
Table 19: Evaluation Metrics and Benchmarks
Table 20: Use Cases and When to Use GraphRAG
Table 21: Limitations and Challenges
Table 22: LLM Provider Choices for GraphRAG
Table 23: Integration with LangChain and LlamaIndex
Table 24: Document Parsing and Preprocessing
Table 25: Agentic GraphRAG and Future Directions

Table 1: Core GraphRAG Concepts

GraphRAG fundamentally reimagines retrieval by replacing flat semantic search with structured graph reasoning. Understanding these foundational concepts clarifies why GraphRAG excels at relationship-driven queries where standard RAG fails.

Concept	Example	Description
GraphRAG	Microsoft's approach: extract entities → build hierarchy → generate summaries → query via map-reduce	• RAG paradigm that retrieves from a knowledge graph, in place of or alongside purely vector-based similarity search • enables multi-hop reasoning and explainable answers
Knowledge Graph (KG)	Nodes = `Person`, `Organization`; Edges = `WORKS_FOR`, `INVESTED_IN`	• Structured representation of data as entities (nodes) connected by relationships (edges) • captures semantics beyond flat text
Entity Extraction	LLM extracts `"John Smith, CEO, TechCorp"` → nodes `Person(John Smith)`, `Organization(TechCorp)`, edge `ROLE_AT`	• Process of identifying named entities (people, places, concepts) from unstructured text • forms graph nodes
Relationship Extraction	From text: `"Alice hired Bob"` → triple `(Alice, HIRED, Bob)`	• Detecting semantic connections between entities • forms graph edges • can be LLM-based or NLP rule-based
Community Detection	Leiden algorithm clusters related entities into communities	• Graph clustering to group densely connected entities • enables hierarchical summarization at scale
Community Summary	LLM generates: `"Community 7 focuses on AI safety research, key members: Anthropic, OpenAI..."`	• Abstract of a detected community • generated by LLM from entities/relationships • powers global search
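The rows above can be tied together in a small end-to-end sketch: triples become a graph, the graph is split into communities, each community gets a summary, and a global query is answered map-reduce style over those summaries. Several pieces are deliberate simplifications and should be read as assumptions: connected components stand in for the Leiden algorithm, a string template stands in for the LLM-written summary, and keyword overlap stands in for LLM scoring. All function names and data are hypothetical.

```python
from collections import defaultdict

# Hypothetical extracted triples (head, relation, tail).
TRIPLES = [
    ("Alice", "HIRED", "Bob"),
    ("Bob", "WORKS_FOR", "TechCorp"),
    ("Carol", "INVESTED_IN", "DataCo"),
]


def build_undirected(triples):
    """Ignore edge direction and labels; clustering only needs connectivity."""
    adjacency = defaultdict(set)
    for head, _, tail in triples:
        adjacency[head].add(tail)
        adjacency[tail].add(head)
    return adjacency


def detect_communities(adjacency):
    """Connected components: a crude stand-in for Leiden clustering."""
    seen, communities = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(adjacency[node] - members)
        seen |= members
        communities.append(sorted(members))
    return communities


def summarize(members, triples):
    """Template summary; a real pipeline would prompt an LLM here."""
    facts = [f"{h} {r} {t}" for h, r, t in triples if h in members]
    return f"Members: {', '.join(members)}. Facts: {'; '.join(facts)}."


def global_search(query, summaries):
    """Map: score each summary by keyword overlap. Reduce: rank the matches."""
    terms = set(query.lower().split())
    scored = []
    for community_id, summary in summaries.items():
        cleaned = summary.lower()
        for mark in ",.;:":
            cleaned = cleaned.replace(mark, " ")
        score = len(terms & set(cleaned.split()))
        if score:
            scored.append((score, community_id))
    return [cid for _, cid in sorted(scored, key=lambda s: (-s[0], s[1]))]


communities = detect_communities(build_undirected(TRIPLES))
summaries = {i: summarize(m, TRIPLES) for i, m in enumerate(communities)}
print(communities)
print(global_search("DataCo investors", summaries))
```

The point of the design is that a global query never touches raw text chunks: it reads only the per-community summaries, which is what lets dataset-wide questions scale. Swapping in a real clustering algorithm or an LLM summarizer changes the quality of each step, not the shape of the pipeline.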
