AI/LLM Memory & Reasoning Cheat Sheet


Updated 2026-05-28

AI memory and reasoning systems enable large language models and agents to retain information across interactions and solve complex problems through structured thought processes. Memory systems range from short-term conversation buffers to persistent knowledge graphs, while reasoning techniques guide models through step-by-step problem decomposition, verification, and refinement. In 2026, context engineering has emerged as the overarching discipline for managing what information lives in the context window at each step, and memory scaling (the property that agent performance improves as accumulated experience grows) is now treated as a new scaling axis alongside parametric and inference-time scaling. Understanding the interplay between memory architecture, reasoning strategy, multi-agent coordination, and efficient context management is critical for building production agents that maintain context, reduce hallucinations, and execute multi-step workflows reliably.

What This Cheat Sheet Covers

This topic spans 17 focused tables and 134 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

  • Table 1: Memory Types and Architecture
  • Table 2: Memory Management Frameworks
  • Table 3: Context Window Management
  • Table 4: Reasoning Techniques — Chain-Based
  • Table 5: Reasoning Techniques — Tree and Graph-Based
  • Table 6: Reasoning Techniques — Reflection and Refinement
  • Table 7: Reasoning Techniques — Specialized
  • Table 8: Inference-Time Compute Scaling
  • Table 9: Retrieval-Augmented Generation (RAG) and Memory
  • Table 10: Memory Retrieval Strategies
  • Table 11: In-Context Learning and Prompting
  • Table 12: Agent State and Session Management
  • Table 13: Multi-Agent Memory
  • Table 14: Vector Databases for Memory
  • Table 15: Memory and Reasoning Trade-offs
  • Table 16: Memory Evaluation Benchmarks
  • Table 17: Model Architectures for Memory Efficiency

Table 1: Memory Types and Architecture

Memory in LLM agents draws directly from cognitive science: separating episodic, semantic, and procedural stores enables selective retrieval and prevents context overload. Choosing the right type — or the right combination — determines both how much an agent remembers and how cheaply it retrieves the right fact at query time.
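As a rough illustration of why separate stores allow selective retrieval, the sketch below keeps one list per memory type and scans only the store that was asked for. `TypedMemoryStore`, `MemoryRecord`, and the tag-based lookup are hypothetical names chosen for this example, not part of any particular framework; a real system would likely use embeddings or a database rather than tag matching.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MemoryRecord:
    kind: str  # "episodic", "semantic", or "procedural"
    content: Any
    tags: set = field(default_factory=set)


class TypedMemoryStore:
    """Hypothetical in-memory store that keeps one list per memory type."""

    def __init__(self):
        self._stores = defaultdict(list)

    def add(self, kind, content, tags=()):
        record = MemoryRecord(kind, content, set(tags))
        self._stores[kind].append(record)
        return record

    def retrieve(self, kind, tag):
        # Only the requested store is scanned, so unrelated memory
        # types never reach the context window.
        return [r.content for r in self._stores[kind] if tag in r.tags]


store = TypedMemoryStore()
store.add(
    "episodic",
    {"timestamp": "2026-04-01", "event": "user asked about X", "outcome": "provided Y"},
    tags={"X"},
)
store.add("semantic", ("Paris", "capital_of", "France"), tags={"Paris", "France"})
store.add(
    "procedural",
    ["analyze_input", "generate_plan", "execute_steps", "verify_output"],
    tags={"default_workflow"},
)

# Ask only the semantic store about Paris; episodic records are not touched.
facts = store.retrieve("semantic", "Paris")
print(facts)
```

Keeping the type as an explicit key is one simple design choice; it makes it cheap to load, say, only procedural memory when planning and only episodic memory when recalling a past interaction.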

| Type | Example | Description |
|---|---|---|
| Short-Term (Working) Memory | `messages = [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello"}]` | Stores recent conversation turns within the current session; limited by context window size and discarded when the session ends. |
| Long-Term Memory | `vector_db.store(embedding, metadata, user_id)` | Persists knowledge across sessions using external storage; retrieved dynamically based on relevance rather than loaded wholesale. |
| Episodic Memory | `{"timestamp": "2026-04-01", "event": "user asked about X", "outcome": "provided Y"}` | Records specific events and interactions with temporal context; allows the agent to recall what happened when and learn from past experiences. |
| Semantic Memory | `knowledge_graph.add_fact("Paris", "capital_of", "France")` | Stores general facts and relationships independent of when they were learned; declarative knowledge without time markers. |
| Procedural Memory | `workflow = ["analyze_input", "generate_plan", "execute_steps", "verify_output"]` | Encodes learned skills, workflows, and action sequences; guides how to perform tasks rather than storing facts, often implemented as tool patterns or code. |
| Observation Memory | `obs = synthesize(raw_facts, level="insight"); store.add(obs, type="observation")` | Higher-order synthesis of raw episodic and semantic memories into generalizable insights; improves multi-hop recall by giving retrieval access to richer, more abstract representations. |
| Vector Memory | `results = vector_db.search(query_embedding, top_k=5)` | Embeds memories as vectors; enables semantic similarity search to retrieve contextually relevant information from large memory stores. |
| Graph Memory | `graph.add_edge("user_123", "prefers", "dark_mode")` | Represents knowledge as entities and relationships in a graph structure; captures complex associations and multi-hop reasoning paths. |
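To make the Vector Memory row more concrete, here is a minimal sketch of top-k retrieval by cosine similarity using only the standard library. `ToyVectorMemory` is a hypothetical stand-in for the `vector_db` object in the table, and the three-dimensional embeddings are hand-written placeholders; in practice the vectors would come from an embedding model and the search would run in a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorMemory:
    """Hypothetical brute-force vector store, for illustration only."""

    def __init__(self):
        self._items = []  # list of (embedding, text) pairs

    def store(self, embedding, text):
        self._items.append((list(embedding), text))

    def search(self, query_embedding, top_k=5):
        # Score every stored memory, then keep the top_k most similar.
        scored = [
            (cosine_similarity(query_embedding, emb), text)
            for emb, text in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


memory = ToyVectorMemory()
# Placeholder embeddings; a real system would compute these with a model.
memory.store([1.0, 0.0, 0.0], "user prefers dark mode")
memory.store([0.0, 1.0, 0.0], "user asked about X")
memory.store([0.9, 0.1, 0.0], "user changed theme settings")

results = memory.search([1.0, 0.0, 0.0], top_k=2)
top_texts = [text for _, text in results]
print(top_texts)
```

The brute-force scan is linear in the number of stored memories, which is fine for a sketch; large stores typically rely on approximate nearest-neighbor indexes instead.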
