Advanced RAG Patterns and Optimization Cheat Sheet

Updated 2026-05-19

Retrieval-Augmented Generation (RAG) grounds a language model in external documents instead of relying solely on its frozen weights. "Advanced RAG" is the engineering discipline of making every stage of that pipeline (indexing, retrieval, query handling, post-processing, and generation) measurably better than the naive embed-and-fetch-top-k baseline. It matters because most production failures are retrieval failures, not generation failures: the model answers well when given the right context and hallucinates when given the wrong one, so the highest-leverage work happens before the LLM ever sees a token. The key mental model to carry into the tables below: treat RAG as a multi-stage funnel where recall is won early (hybrid retrieval, smart chunking, query transformation) and precision is won late (reranking, compression, context ordering). Bigger context windows do not remove the need for sharp retrieval; they only hide the cost of imprecision until it silently degrades answers.

What This Cheat Sheet Covers

This topic spans 9 focused tables and 72 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

  • Table 1: Hybrid and Neural Retrieval Foundations
  • Table 2: Indexing and Chunking Strategies
  • Table 3: Query Transformation and Routing
  • Table 4: Post-Retrieval Refinement
  • Table 5: Adaptive, Corrective and Iterative RAG
  • Table 6: Knowledge-Structured and Memory-Augmented RAG
  • Table 7: Context Window and Cost Management
  • Table 8: Retriever Training and RAG Evaluation
  • Table 9: RAG vs Fine-Tuning Decision Framework

Table 1: Hybrid and Neural Retrieval Foundations

The retrieval layer decides the ceiling on answer quality; combining lexical exactness with semantic understanding, and adding neural ranking, is the foundation every other advanced pattern builds on.

| Technique | Example | Description |
|---|---|---|
| Dense vector retrieval | `vectorstore.similarity_search(query, k=5)` | Embeds query and chunks into vectors, retrieves by cosine/dot-product similarity; captures semantic meaning but can miss exact keywords. |
| BM25 (sparse lexical retrieval) | `BM25Retriever.from_documents(docs)` | Probabilistic term-frequency ranking; excels at exact keyword, code, and rare-term matches that dense embeddings blur. |
| Hybrid search (sparse + dense) | `hybrid(query, alpha=0.5)` | Runs lexical and vector retrievers in parallel and merges results; consistently beats either alone on heterogeneous corpora and is the de facto production default. |
| Reciprocal Rank Fusion (RRF) | $\text{RRF}(d)=\sum_{r}\frac{1}{k+\text{rank}_r(d)}$ | Rank-only fusion of multiple result lists (k≈60); merges hybrid or multi-query results without needing comparable scores. |
| Cross-encoder reranking | `reranker.rank(query, docs)[:top_n]` | Second stage that jointly encodes query+document for a precise relevance score; slow but high-precision, applied only to top candidates. |
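
The two fusion entries above (alpha-weighted hybrid search and RRF) can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration rather than any particular library's API: the function names `rrf_fuse` and `weighted_hybrid`, the document ids and scores, the min-max normalisation step, and the convention that `alpha=1.0` means "dense only" (some engines define it this way, but check your own).

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Reciprocal Rank Fusion over best-first lists of document ids.

    Implements RRF(d) = sum over lists r of 1 / (k + rank_r(d)), ranks from 1.
    A document missing from a list simply contributes nothing for that list.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def weighted_hybrid(lexical_scores, dense_scores, alpha=0.5):
    """Score-based fusion: alpha * dense + (1 - alpha) * lexical.

    Raw BM25 and cosine scores live on different scales, so each set is
    min-max normalised to [0, 1] first (an assumed, simple choice).
    """
    def normalise(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: ((s - lo) / span if span else 1.0) for d, s in scores.items()}

    lex, den = normalise(lexical_scores), normalise(dense_scores)
    blended = {
        d: alpha * den.get(d, 0.0) + (1 - alpha) * lex.get(d, 0.0)
        for d in set(lex) | set(den)
    }
    return sorted(blended.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from a lexical and a dense retriever for one query.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_b", "doc_d", "doc_a"]

fused_ids = [doc_id for doc_id, _ in rrf_fuse([bm25_ranking, dense_ranking])]
print(fused_ids)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

`doc_b` wins because it ranks near the top of both lists, while `doc_c` and `doc_d` each appear in only one. RRF needs only ranks, which is why it suits retrievers whose scores are not comparable; the weighted variant exposes a tunable `alpha` but depends on how the scores are normalised.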
