AI-LLM Memory & Reasoning Cheat Sheet

Updated 2026-05-28


AI memory and reasoning systems enable large language models and agents to retain information across interactions and solve complex problems through structured thought processes. Memory systems range from short-term conversation buffers to persistent knowledge graphs, while reasoning techniques guide models through step-by-step problem decomposition, verification, and refinement. In 2026, context engineering has emerged as the overarching discipline for managing what information lives in the context window at each step, and memory scaling — the property that agent performance improves as accumulated experience grows — has emerged as a new axis alongside parametric and inference-time scaling. Understanding the interplay between memory architecture, reasoning strategy, multi-agent coordination, and efficient context management is critical for building production agents that maintain context, reduce hallucinations, and execute multi-step workflows reliably.

What This Cheat Sheet Covers

This topic spans 17 focused tables and 134 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Memory Types and Architecture
Table 2: Memory Management Frameworks
Table 3: Context Window Management
Table 4: Reasoning Techniques — Chain-Based
Table 5: Reasoning Techniques — Tree and Graph-Based
Table 6: Reasoning Techniques — Reflection and Refinement
Table 7: Reasoning Techniques — Specialized
Table 8: Inference-Time Compute Scaling
Table 9: Retrieval-Augmented Generation (RAG) and Memory
Table 10: Memory Retrieval Strategies
Table 11: In-Context Learning and Prompting
Table 12: Agent State and Session Management
Table 13: Multi-Agent Memory
Table 14: Vector Databases for Memory
Table 15: Memory and Reasoning Trade-offs
Table 16: Memory Evaluation Benchmarks
Table 17: Model Architectures for Memory Efficiency

Table 1: Memory Types and Architecture

Memory in LLM agents draws directly from cognitive science: separating episodic, semantic, and procedural stores enables selective retrieval and prevents context overload. Choosing the right type — or the right combination — determines both how much an agent remembers and how cheaply it retrieves the right fact at query time.
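As a rough illustration of why separate stores help, the sketch below keeps episodic, semantic, and procedural memories in distinct containers so a query only touches the store it needs. The `AgentMemory` class and its method names are hypothetical, chosen for this example only; they do not come from any particular framework.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Hypothetical container with one store per memory type."""
    episodic: list = field(default_factory=list)    # time-stamped events
    semantic: dict = field(default_factory=dict)    # (subject, relation) -> object
    procedural: dict = field(default_factory=dict)  # task name -> ordered steps

    def record_event(self, timestamp, event, outcome):
        # Episodic: what happened, and when.
        self.episodic.append(
            {"timestamp": timestamp, "event": event, "outcome": outcome}
        )

    def add_fact(self, subject, relation, obj):
        # Semantic: a timeless fact, stored without a timestamp.
        self.semantic[(subject, relation)] = obj

    def add_workflow(self, task, steps):
        # Procedural: how to perform a task, as an ordered list of steps.
        self.procedural[task] = list(steps)

    def recall(self, kind, **query):
        """Route the query to a single store instead of loading everything."""
        if kind == "episodic":
            # ISO-8601 date strings sort chronologically as plain strings.
            since = query.get("since", "")
            return [e for e in self.episodic if e["timestamp"] >= since]
        if kind == "semantic":
            return self.semantic.get((query["subject"], query["relation"]))
        if kind == "procedural":
            return self.procedural.get(query["task"])
        raise ValueError(f"unknown memory kind: {kind}")


memory = AgentMemory()
memory.record_event("2026-03-15", "user asked about W", "provided V")
memory.record_event("2026-04-01", "user asked about X", "provided Y")
memory.add_fact("Paris", "capital_of", "France")
memory.add_workflow("answer_question", ["analyze_input", "generate_plan"])

recent = memory.recall("episodic", since="2026-04-01")
capital = memory.recall("semantic", subject="Paris", relation="capital_of")
steps = memory.recall("procedural", task="answer_question")
```

Only the requested store is read on each call, so the amount of text placed into the context window stays proportional to the question rather than to the total memory size. A production system would typically back each store with a database rather than in-process Python objects.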

Type	Example	Description
Short-Term (Working) Memory	`messages = [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello"}]`	• Stores recent conversation turns within the current session • limited by context window size and discarded when the session ends.
Long-Term Memory	`vector_db.store(embedding, metadata, user_id)`	• Persists knowledge across sessions using external storage • retrieved dynamically based on relevance rather than loaded wholesale.
Episodic Memory	`{"timestamp": "2026-04-01", "event": "user asked about X", "outcome": "provided Y"}`	• Records specific events and interactions with temporal context • allows the agent to recall what happened when and learn from past experiences.
Semantic Memory	`knowledge_graph.add_fact("Paris", "capital_of", "France")`	• Stores general facts and relationships independent of when they were learned • declarative knowledge without time markers.
Procedural Memory	`workflow = ["analyze_input", "generate_plan", "execute_steps", "verify_output"]`	• Encodes learned skills, workflows, and action sequences • guides how to perform tasks rather than storing facts, often implemented as tool patterns or code.
Observation Memory	`obs = synthesize(raw_facts, level="insight")` `store.add(obs, type="observation")`	• Higher-order synthesis of raw episodic and semantic memories into generalizable insights • improves multi-hop recall by giving retrieval access to richer, more abstract representations.
Vector Memory	`results = vector_db.search(query_embedding, top_k=5)`	• Embeds memories as vectors • enables semantic similarity search to retrieve contextually relevant information from large memory stores.
Graph Memory	`graph.add_edge("user_123", "prefers", "dark_mode")`	• Represents knowledge as entities and relationships in a graph structure • captures complex associations and multi-hop reasoning paths.
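The one-line examples in the table use placeholder objects (`vector_db`, `graph`). The sketch below gives minimal in-memory stand-ins for three of the rows, using only the Python standard library. The class names, the `multi_hop` helper, and the hand-written two-dimensional embeddings are assumptions made for illustration; a real system would use an embedding model and a dedicated vector or graph database.

```python
import math
from collections import defaultdict, deque


class ShortTermMemory:
    """Working memory: keeps only the most recent turns, standing in for a
    context window limit (counted in turns here, not tokens)."""

    def __init__(self, max_turns=4):
        self.messages = deque(maxlen=max_turns)

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorMemory:
    """Brute-force similarity search over stored embeddings."""

    def __init__(self):
        self.items = []

    def store(self, embedding, metadata):
        self.items.append((embedding, metadata))

    def search(self, query_embedding, top_k=5):
        ranked = sorted(
            self.items,
            key=lambda item: cosine(query_embedding, item[0]),
            reverse=True,
        )
        return [metadata for _, metadata in ranked[:top_k]]


class GraphMemory:
    """Entities and labelled relationships, with a simple multi-hop walk."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add_edge(self, subject, relation, obj):
        self.edges[subject].append((relation, obj))

    def multi_hop(self, start, relations):
        # Follow one relation per hop, starting from a single entity.
        frontier = [start]
        for relation in relations:
            frontier = [
                obj
                for node in frontier
                for rel, obj in self.edges.get(node, [])
                if rel == relation
            ]
        return frontier


short_term = ShortTermMemory(max_turns=2)
for role, text in [("user", "Hi"), ("assistant", "Hello"), ("user", "Help me")]:
    short_term.add(role, text)

vectors = VectorMemory()
vectors.store([1.0, 0.0], {"text": "likes dark mode"})
vectors.store([0.0, 1.0], {"text": "lives in Paris"})
hits = vectors.search([0.9, 0.1], top_k=1)

graph = GraphMemory()
graph.add_edge("user_123", "prefers", "dark_mode")
graph.add_edge("Paris", "capital_of", "France")
graph.add_edge("France", "part_of", "Europe")
path_end = graph.multi_hop("Paris", ["capital_of", "part_of"])
```

The short-term buffer silently drops the oldest turn once it is full, which mirrors how content falls out of a fixed context window. The linear scan in `VectorMemory.search` is fine for a handful of items; vector databases replace it with an approximate nearest-neighbor index once the store grows.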
