Advanced RAG Patterns and Optimization Cheat Sheet


Updated 2026-05-19

Retrieval-Augmented Generation (RAG) grounds a language model in external documents instead of relying solely on its frozen weights. "Advanced RAG" is the engineering discipline of making every stage of that pipeline (indexing, retrieval, query handling, post-processing, and generation) measurably better than the naive embed-and-fetch-top-k baseline. It matters because most production failures are retrieval failures, not generation failures: the model answers well when given the right context and hallucinates when given the wrong one, so the highest-leverage work happens before the LLM ever sees a token. The key mental model to carry into the tables below is to treat RAG as a multi-stage funnel where recall is won early (hybrid retrieval, smart chunking, query transformation) and precision is won late (reranking, compression, context ordering). Bigger context windows do not remove the need for sharp retrieval; they only hide the cost of imprecision until it silently degrades answers.

What This Cheat Sheet Covers

This topic spans 9 focused tables and 72 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.

• Table 1: Hybrid and Neural Retrieval Foundations
• Table 2: Indexing and Chunking Strategies
• Table 3: Query Transformation and Routing
• Table 4: Post-Retrieval Refinement
• Table 5: Adaptive, Corrective and Iterative RAG
• Table 6: Knowledge-Structured and Memory-Augmented RAG
• Table 7: Context Window and Cost Management
• Table 8: Retriever Training and RAG Evaluation
• Table 9: RAG vs Fine-Tuning Decision Framework

Table 1: Hybrid and Neural Retrieval Foundations

The retrieval layer decides the ceiling on answer quality; combining lexical exactness with semantic understanding, and adding neural ranking, is the foundation every other advanced pattern builds on.

Technique | Example | Description
Dense vector retrieval | vectorstore.similarity_search(query, k=5) | Embeds query and chunks into vectors and retrieves by cosine or dot-product similarity; captures semantic meaning but can miss exact keywords.
BM25 (sparse lexical retrieval) | BM25Retriever.from_documents(docs) | Probabilistic term-frequency ranking; excels at exact keyword, code, and rare-term matches that dense embeddings blur.
Hybrid search (sparse + dense) | hybrid(query, alpha=0.5) | Runs lexical and vector retrievers in parallel and merges results; typically beats either alone on heterogeneous corpora and is a common production default.
Reciprocal Rank Fusion (RRF) | \text{RRF}(d)=\sum_{r}\frac{1}{k+\text{rank}_r(d)} | Rank-only fusion that sums over each result list r (k≈60); merges hybrid or multi-query results without needing comparable scores.
Cross-encoder reranking | reranker.rank(query, docs)[:top_n] | Second stage that jointly encodes query and document for a precise relevance score; slow but high-precision, so it is applied only to the top candidates.
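
The two fusion rows in Table 1 can be made concrete with a small, dependency-free sketch. It is an illustration under stated assumptions, not the implementation of any particular library. The function names `reciprocal_rank_fusion` and `weighted_hybrid`, the document ids, and the scores are all hypothetical. It also assumes `alpha` weights the dense side (1.0 = pure dense, 0.0 = pure lexical) and that scores are min-max normalized before mixing. Real hybrid-search APIs may define `alpha` and normalization differently.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of doc ids using RRF(d) = sum_r 1 / (k + rank_r(d)).

    Only ranks are used, so the input retrievers do not need comparable scores.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def _min_max(scores):
    """Rescale a {doc_id: score} dict to [0, 1]; constant inputs map to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {d: ((s - lo) / span if span else 1.0) for d, s in scores.items()}


def weighted_hybrid(dense_scores, sparse_scores, alpha=0.5):
    """Score-based fusion: alpha * dense + (1 - alpha) * sparse.

    Assumption: alpha weights the dense retriever. Documents missing from one
    retriever contribute 0.0 from that side.
    """
    dense_n, sparse_n = _min_max(dense_scores), _min_max(sparse_scores)
    fused = {
        d: alpha * dense_n.get(d, 0.0) + (1 - alpha) * sparse_n.get(d, 0.0)
        for d in set(dense_n) | set(sparse_n)
    }
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical results from a lexical (BM25-style) and a dense retriever.
bm25_ranking = ["doc_b", "doc_a", "doc_c"]
dense_ranking = ["doc_a", "doc_d", "doc_b"]
rrf_order = reciprocal_rank_fusion([bm25_ranking, dense_ranking])

dense_scores = {"doc_a": 0.9, "doc_d": 0.7, "doc_b": 0.5}    # e.g. cosine similarity
sparse_scores = {"doc_b": 12.0, "doc_a": 8.0, "doc_c": 4.0}  # e.g. BM25 scores
hybrid_order = weighted_hybrid(dense_scores, sparse_scores, alpha=0.5)

print(rrf_order)
print(hybrid_order)
```

RRF is usually the simpler choice when the retrievers produce scores on different scales, because it needs no normalization step. The weighted variant gives an explicit knob for trading lexical against semantic matches, at the cost of having to choose a normalization scheme. In both cases the fused list would normally be truncated to a candidate set and passed on to the cross-encoder reranking stage.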
