© 2026 CheatGrid™. All rights reserved.

AI Agents Cheat Sheet


Updated 2026-04-05

AI agents are autonomous systems built on large language models that perceive their environment, reason through complex tasks, and take actions with external tools to achieve goals. Unlike traditional chatbots, which simply respond to queries, agents run a continuous think-act-observe loop and plan each next step from the outcome of the last. The defining characteristic is tool use: agents don't just generate text; they execute functions, query databases, call APIs, and coordinate with other agents and user interfaces through standardized protocols such as MCP, A2A, and AG-UI. This shift from prediction to execution makes agents the foundation of agentic AI, turning LLMs from assistants into operational systems capable of multi-step workflows, self-correction, and long-horizon task completion. Understanding agent architecture (perception, reasoning engines, memory systems, and orchestration patterns) is essential for building reliable production agents in 2026. Frameworks like LangGraph, CrewAI, OpenAI Agents SDK, Google ADK, and PydanticAI provide production-ready primitives, while evaluation frameworks like DeepEval and techniques like LLM-as-Judge enable systematic quality measurement.


Table 1: Core Agent Concepts

Concept | Example | Description
Agent Loop
while not done:
    thought = think(observation)
    action = decide(thought)
    observation = execute(action)
• Continuous think-act-observe cycle where agent reasons about current state, takes action, observes result, then repeats
• fundamental execution pattern for all agents.
Tool Use
tools = [calculator, web_search, db_query]
agent.invoke("What's 2^16?", tools)
• Agent invokes external functions or APIs to perform actions beyond text generation
• requires structured tool definitions with JSON schemas describing parameters and expected outputs.
Function Calling
{"name": "get_weather", "args": {"city": "NYC"}}
• LLM generates structured function invocation by selecting appropriate tool and filling parameters
• model returns JSON that agent runtime executes, not the result itself.
Agentic Workflow
Goal → Plan → Execute → Reflect → Retry
• Goal-driven execution pattern where agent decomposes tasks, takes actions, evaluates outcomes, and adjusts strategy
• contrasts with linear prompt chains.
Perception
raw_input = {"text": query, "context": session}
structured_data = parse(raw_input)
• Transforms raw inputs (text, API calls, sensor data) into structured representations the reasoning engine processes
• includes parsing, normalization, and context extraction.
Reasoning Engine
LLM + planning algorithm + memory
• Core decision-making component that analyzes inputs, selects actions, and generates plans
• typically an LLM enhanced with prompting strategies like ReAct or chain-of-thought.
Action Space
["send_email", "query_db", "http_get", "file_write"]
• Set of available tools and operations agent can execute
• defines boundaries of what agent can accomplish in its environment.
Autonomy
Agent decides when to stop vs. asks user for next step
• Degree to which agent makes decisions without human intervention
• ranges from fully autonomous to human-in-the-loop requiring approval per action.
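
The Agent Loop, Tool Use, and Function Calling rows above can be sketched together. This is a minimal illustration, not any framework's API: `fake_model`, `TOOLS`, and `run_agent` are hypothetical names, and the "model" is a scripted stand-in that first emits a JSON function call and then a final answer.

```python
import json

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS = {
    "power": lambda base, exp: base ** exp,
}

def fake_model(observation):
    """Scripted stand-in for an LLM: requests a tool first, then answers."""
    if observation is None:
        return json.dumps({"name": "power", "args": {"base": 2, "exp": 16}})
    return json.dumps({"final": f"The answer is {observation}"})

def run_agent(max_steps=5):
    observation = None
    for _ in range(max_steps):
        decision = json.loads(fake_model(observation))  # think
        if "final" in decision:                         # agent decides it is done
            return decision["final"]
        tool = TOOLS[decision["name"]]                  # select the requested tool
        observation = tool(**decision["args"])          # act, then observe the result
    raise RuntimeError("step budget exhausted")

answer = run_agent()
print(answer)
```

Note that the model only returns the structured call; the runtime executes it and feeds the result back, and the step budget bounds the loop if the model never finishes.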

Table 2: Architecture Patterns

Pattern | Example | Description
ReAct
Thought: Need price data
Action: query_db("SELECT price")
Observation: $42
Thought: Now calculate...
• Reasoning and Acting interleaved
• agent generates Thought explaining reasoning, executes Action, receives Observation of result, then generates next Thought
• think-act-observe pattern improves reliability and debuggability.
Reflection
output = generate()
critique = evaluate(output, criteria)
if not good: revise(critique)
• Agent self-evaluates outputs against quality criteria, then iterates to improve
• generates, critiques, and refines until meeting standards or max attempts.
Evaluator-Optimizer
output = generator.propose(task)
feedback = evaluator.critique(output, criteria)
if not done: generator.revise(feedback)
• Generator produces output, evaluator critiques against criteria, loop repeats until quality threshold met
• enables autonomous quality assurance via two cooperating agents instead of one.
Planning
Goal → Subtasks → Order → Execute
• Agent decomposes complex goals into actionable subtasks, determines execution order, then coordinates completion
• enables long-horizon task solving.
Chain-of-Thought
Let's think step-by-step:
1. Parse question
2. Identify data needed
3. Calculate result
• Prompts agent to show intermediate reasoning steps
• improves accuracy on complex tasks by making thought process explicit before answering.
Router
if query_type == "sql":
    route_to(sql_agent)
else:
    route_to(general_agent)
• Conditional dispatcher that directs requests to specialized agents or tools based on content analysis
• enables modular, expert-based architectures.
Orchestrator-Worker
supervisor.assign(task, worker_agents)
results = await all_workers()
supervisor.synthesize(results)
• Hierarchical delegation where supervisor breaks work into parallel subtasks, assigns to worker agents, aggregates results
• scales complex workflows efficiently.
Agent Handoff
handoff(source=triage_agent,
        target=billing_agent,
        condition="billing question")
• One agent transfers control to another specialized agent mid-conversation
• enables seamless delegation where the receiving agent continues with full context.
Plan-and-Execute
plan = planner.create(goal)
for step in plan:
    execute(step)
• Separates planning from execution
• planner creates full task breakdown upfront, then executor runs each step
• clearer than pure ReAct for multi-step workflows.
Tree-of-Thought
Explore multiple reasoning paths in tree structure, backtrack if needed
• Agent explores multiple solution paths simultaneously, evaluating each branch before committing
• useful when single linear reasoning path may fail.
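
The Reflection and Evaluator-Optimizer rows above share one loop shape: generate, critique, revise, stop at a threshold or attempt cap. A minimal sketch follows, assuming a toy criterion (output must end with a period); `generate`, `evaluate`, and `reflect` are hypothetical stand-ins for LLM-backed components.

```python
def generate(task, feedback=None):
    """Hypothetical generator: the first draft omits the final period."""
    draft = task.capitalize()
    if feedback == "missing period":
        draft += "."
    return draft

def evaluate(output):
    """Hypothetical evaluator: returns None when the output passes."""
    return None if output.endswith(".") else "missing period"

def reflect(task, max_attempts=3):
    feedback = None
    for attempt in range(1, max_attempts + 1):
        output = generate(task, feedback)   # propose or revise
        feedback = evaluate(output)         # critique against the criterion
        if feedback is None:                # quality bar met
            return output, attempt
    return output, max_attempts             # best effort after the attempt cap

result, attempts = reflect("summarize the report")
print(result, attempts)
```

The attempt cap matters as much as the critique: without it, a generator that never satisfies the evaluator loops forever.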

Table 3: Memory Systems

Type | Example | Description
Short-Term Memory
Current conversation context in prompt
• Working context for active task
• contents of current LLM context window including system prompt, recent messages, and intermediate outputs.
Long-Term Memory
Vector DB storing past interactions
• Persistent storage across sessions
• agent retrieves relevant historical context to inform current decisions
• critical for maintaining user preferences and learning.
Zep/Graphiti
graphiti.add_episode(messages)
results = graphiti.search("user prefs",
                          center_node_uuid)
• Temporal knowledge graph memory storing facts with validity windows
• outperforms vector-only approaches on cross-session temporal reasoning (63.8% vs 49% on LongMemEval).
Episodic Memory
"On 2026-01-15, user preferred JSON output"
• Stores specific past events and experiences with temporal context
• enables agent to recall "what happened when" for situational awareness.
Semantic Memory
"User always wants reports in PDF format"
• Stores facts, preferences, and general knowledge learned over time
• extracted and consolidated from episodic traces into persistent rules.
Procedural Memory
Learned workflows like "how to file a bug report"
• Encodes how to perform tasks
• captures successful action sequences as reusable procedures the agent can invoke.
Letta (MemGPT)
agent = create_agent(model="openai/gpt-4o")
agent.send_message("Prefer JSON output")
• OS-inspired tiered memory runtime (core memory, recall storage, archival) where agents self-edit their own memory blocks
• supports background learning and persistent, cross-session knowledge.
Working Memory
Variables tracking current task state
• Active state during execution
• holds intermediate results, loop counters, and task-specific variables distinct from conversation history.
Shared Memory
redis.set("team_context", state)
other_agent.get("team_context")
• Cross-agent state enabling coordination
• multiple agents read/write shared context to maintain consistency in multi-agent systems.
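
The split between short-term and long-term memory described above can be illustrated with a small class. This is a sketch under simplifying assumptions: `AgentMemory` is a hypothetical name, the short-term window is a bounded deque, and keyword matching stands in for the vector or graph retrieval a real system would use.

```python
from collections import deque

class AgentMemory:
    def __init__(self, short_term_size=3):
        # Short-term: only the most recent messages fit in the "context window".
        self.short_term = deque(maxlen=short_term_size)
        # Long-term: everything is kept and searched on demand.
        self.long_term = []

    def remember(self, message):
        self.short_term.append(message)
        self.long_term.append(message)

    def recall(self, keyword):
        # Keyword match is a stand-in for embedding similarity search.
        return [m for m in self.long_term if keyword.lower() in m.lower()]

memory = AgentMemory(short_term_size=2)
for msg in ["User prefers JSON output", "Weather is sunny", "Report due Friday"]:
    memory.remember(msg)

print(list(memory.short_term))
print(memory.recall("json"))
```

The oldest message has fallen out of the short-term window but is still recoverable from long-term storage, which is the behavior the two rows describe.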

Table 4: Multi-Agent Systems

Pattern | Example | Description
Hierarchical
manager → specialists → workers
• Supervisor delegates to specialist agents who may further delegate to workers
• tasks flow downward, results aggregate upward
• mimics organizational structure.
Collaborative
Agents discuss and debate to reach consensus
• Peer agents interact to solve problems jointly
• no fixed hierarchy, agents contribute expertise and negotiate solutions.
Blackboard
board.write("analysis", result)
if board.has_new("data"):
    agent.process(board.read("data"))
• Agents read/write to a shared state space without direct messaging
• agents activate when relevant data appears, enabling loose coupling and emergent coordination without a central orchestrator.
Sequential
Output of Agent A → Input to Agent B → Output to Agent C
• Linear handoff where each agent processes and passes work to next
• common for assembly-line workflows with distinct stages.
Parallel
Multiple agents execute tasks simultaneously, results merged
• Agents work on independent subtasks concurrently
• coordinator aggregates outputs
• reduces total latency for decomposable work.
Network
Agents communicate peer-to-peer without fixed structure
• Decentralized mesh where any agent can message any other
• emergent coordination without central orchestrator
• complex but flexible.
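
The Parallel pattern above (independent subtasks run concurrently, a coordinator merges the outputs) can be sketched with the standard-library `asyncio` module. The `worker` and `coordinator` functions are hypothetical; `asyncio.sleep(0)` stands in for an LLM or API call.

```python
import asyncio

async def worker(name, subtask):
    await asyncio.sleep(0)  # yields control; stands in for real I/O
    return f"{name}:{subtask.upper()}"

async def coordinator(subtasks):
    jobs = [worker(f"w{i}", task) for i, task in enumerate(subtasks)]
    results = await asyncio.gather(*jobs)  # results come back in input order
    return " | ".join(results)             # merge step

merged = asyncio.run(coordinator(["research", "draft"]))
print(merged)
```

Because `asyncio.gather` preserves input order, the coordinator can merge results deterministically even when workers finish out of order.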

Table 5: Communication Protocols

Protocol | Example | Description
Model Context Protocol (MCP)
mcp_server.list_tools()
mcp_server.call_tool("get_data", args)
• Standardized interface for connecting LLMs to external data sources and tools
• enables discoverable, consistent tool integration across platforms
• originated by Anthropic.
Agent-to-Agent (A2A)
agent_a.send(agent_b, message)
response = agent_b.process_and_reply()
• Inter-agent messaging protocol originated by Google for secure coordination
• agents exchange structured messages to negotiate, delegate, and synchronize work
• now under the Linux Foundation, having absorbed IBM's ACP (Agent Communication Protocol) into a unified open standard.
AG-UI
agent.emit(TextMessageStart(id))
agent.emit(ToolCallStart(id, name))
• Open event-based protocol standardizing real-time communication between AI agents and user interfaces
• enables streaming of text, tool calls, state updates, and human-in-the-loop interactions
• supported by AWS Bedrock, LangGraph, and CrewAI.
Pub-Sub
agent.subscribe("topic/events")
publish("topic/events", data)
• Event-driven broadcast where agents subscribe to topics
• publishers emit events to all subscribers without knowing recipients
• decouples senders and receivers.
Request-Response
response = await agent.call(request)
• Synchronous query-reply pattern
• calling agent blocks until receiving response
• simplest communication model but creates tight coupling.
Message Queue
queue.push(task)
task = queue.pop()
worker.execute(task)
• Agents communicate via asynchronous task queue
• decouples producers from consumers, enables buffering and retry
• common for background work.
CORD
@cord.tool
def my_tool(input: str) -> str: ...
• Code-Oriented Resource Definition protocol for agent-tool communication
• uses Python decorators to define tools, keeping definitions close to implementation
• alternative to MCP for Python-native agent ecosystems.
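
The Pub-Sub row above describes decoupled senders and receivers. A minimal in-process sketch follows; `EventBus` is a hypothetical class, not part of any protocol listed in this table, and a production system would use a real broker.

```python
from collections import defaultdict

class EventBus:
    """In-process topic bus; a stand-in for a real message broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, data):
        # The publisher never learns who, if anyone, is listening.
        for handler in self._handlers.get(topic, []):
            handler(data)

bus = EventBus()
billing_inbox, audit_inbox = [], []
bus.subscribe("topic/events", billing_inbox.append)
bus.subscribe("topic/events", audit_inbox.append)

bus.publish("topic/events", {"type": "invoice_created"})
bus.publish("topic/other", {"type": "ignored"})  # no subscribers, silently dropped

print(billing_inbox, audit_inbox)
```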

Table 6: Agent Frameworks

Framework | Example | Description
LangGraph
StateGraph with nodes, edges, checkpointing
• Graph-based orchestration for stateful, cyclical workflows
• models agents as state machines with conditional routing and built-in persistence
• production-grade.
LangChain
Chains, agents, tools, memory modules
• Flexible toolkit for building LLM applications
• provides abstractions for prompts, tools, memory
• code-first with extensive integrations.
CrewAI
Crew of agents with roles, goals, backstories
• Role-based collaboration where agents simulate team dynamics
• supports sequential and hierarchical processes
• fast prototyping for multi-agent workflows.
OpenAI Agents SDK
Agent(name="Assistant", tools=[...])
Runner.run_sync(agent, query)
• Official OpenAI framework replacing Swarm
• core primitives: Agents, Handoffs, and Guardrails
• Python-first with built-in tracing and MCP support.
AutoGen
ConversableAgent with conversation loops
• Conversational agents that communicate via message passing
• emphasizes agent-to-agent dialogue for task solving
• Microsoft-backed.
Claude Agent SDK
async for msg in query(
        prompt="Fix the bug",
        options=ClaudeAgentOptions()): ...
• Anthropic's official SDK giving programmatic access to Claude Code's capabilities
• built-in tools for file reading, command execution, and web search
• available in Python and TypeScript.
Google ADK
pip install google-adk
Sequential, Parallel, Loop workflow agents
• Google's modular framework optimized for Gemini ecosystem
• supports workflow agents and LLM-driven dynamic routing
• model-agnostic with multi-language support (Python, TypeScript, Go, Java).
PydanticAI
agent = Agent('openai:gpt-5.2',
              output_type=MyModel)
• Type-safe Python framework by the Pydantic team
• structured output with automatic validation, MCP/A2A/AG-UI integration
• built-in evals and Logfire observability.
Semantic Kernel
Plugins, planners, skills for enterprise integration
• Microsoft framework optimized for enterprise scenarios
• tight Azure integration, supports C#/Python
• emphasizes plugins as reusable skills.
Strands Agents
agent = Agent(tools=[calculator])
agent("What is sqrt of 1764")
• AWS open-source SDK with model-driven approach
• model-agnostic supporting Bedrock, Anthropic, OpenAI, Gemini, Ollama
• native MCP support and tool hot-reloading.
smolagents
agent = CodeAgent(tools=[tool], model=model)
agent.run("What is the weather?")
• HuggingFace's lightweight agent library emphasizing code-first tool execution
• agents write and execute Python code as actions rather than JSON tool calls
• supports local and remote models.
Claw Code
claw --model claude run "Refactor auth module"
• Open-source AI coding agent harness (Python+Rust, April 2026)
• plugin-based with 19+ built-in tools (bash, git, LSP), LLM-agnostic; reached 100K+ GitHub stars in its first week.
Agno
agent = Agent(model=OpenAIChat(), tools=[...])
agent.print_response("Summarize this")
• Fast, lightweight framework for building multi-modal agents
• supports memory, knowledge, reasoning, and teams
• designed for minimal footprint and high performance.

Table 7: Tool Integration

Technique | Example | Description
Tool Schema
{"name": "search", "description": "...", "parameters": {...}}
• JSON definition describing tool name, purpose, and expected parameters
• LLM uses schema to understand when and how to invoke tool.
Tool Discovery
available_tools = mcp_server.list_tools()
• Agent queries available tools at runtime rather than having them hardcoded
• enables dynamic tool ecosystems via MCP registries.
Tool Execution
result = tools.execute(function_name, args)
• Runtime invokes function the LLM selected, passing generated arguments
• returns result to agent for next reasoning step.
Structured Output
response_format={"type": "json_schema",
                 "schema": {...}}
• LLM generates schema-constrained output (JSON, XML) using constrained decoding
• ensures reliable parsing for tool arguments and agent-to-agent communication
• supported by OpenAI, Anthropic, Google, and AWS Bedrock.
Tool Result Parsing
observation = parse_tool_output(result)
add_to_context(observation)
• Converts tool output into format agent can reason about
• may include formatting, error extraction, or summarization before feeding back to LLM.
Parallel Tool Use
calls = [get_weather("NYC"), get_weather("LA")]
results = await asyncio.gather(*calls)
• Agent invokes multiple tools simultaneously when tasks are independent
• reduces total latency by parallelizing I/O-bound operations.
Tool Chaining
Output of tool A becomes input to tool B
• Sequential tool composition where result of one function feeds into next
• enables complex workflows from simple building blocks.
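
The Tool Schema and Tool Execution rows above can be combined into a small runtime that checks model-generated arguments against the schema before calling the function. This is a sketch with hypothetical names (`TOOL_SCHEMA`, `REGISTRY`, `execute`) and a hand-rolled check; a real system would likely use a full JSON Schema validator.

```python
TOOL_SCHEMA = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city):
    return {"city": city, "forecast": "sunny"}  # canned result for illustration

# Hypothetical registry pairing each schema with its implementation.
REGISTRY = {"get_weather": (TOOL_SCHEMA, get_weather)}

def execute(call):
    schema, fn = REGISTRY[call["name"]]
    params = schema["parameters"]
    missing = [k for k in params["required"] if k not in call["args"]]
    unknown = [k for k in call["args"] if k not in params["properties"]]
    if missing or unknown:
        # Return the error as an observation so the agent can self-correct.
        return {"error": f"missing={missing} unknown={unknown}"}
    return fn(**call["args"])

ok = execute({"name": "get_weather", "args": {"city": "NYC"}})
bad = execute({"name": "get_weather", "args": {"town": "NYC"}})
print(ok, bad)
```

Returning validation errors as data rather than raising lets the next reasoning step see what went wrong and retry with corrected arguments.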

Table 8: State Management

Concept | Example | Description
Checkpointing
graph.compile(checkpointer=DynamoDB())
state = graph.get_state(thread_id)
• Saving agent state at each step
• enables pause/resume, time-travel debugging, and recovery from failures
• critical for long-running agents.
State Persistence
redis.set(f"session:{id}", state)
• Durable storage of agent state across restarts
• maintains conversation context, memory, and progress when processes terminate.
Thread Management
thread_id = uuid4()
invoke(input, {"thread_id": thread_id})
• Isolating parallel agent sessions
• each thread has independent state to prevent crosstalk in multi-user or concurrent scenarios.
State Schema
class AgentState(TypedDict):
    messages: List[Message]
    step_count: int
• Typed definition of agent state structure
• ensures consistency and enables validation
• critical for complex stateful workflows.
Durable Execution
@workflow.defn
class AgentWorkflow:
    async def run(self): ...
• Agent workflows that survive crashes and restarts via platforms like Temporal
• deterministic replay recovers progress automatically
• critical for long-running production agents.
State Rollback
restore_checkpoint(previous_checkpoint_id)
• Reverting to earlier state after errors or for experimentation
• allows "undo" in agent execution for debugging or optimization.
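
Checkpointing and State Rollback, as described above, reduce to saving immutable snapshots and restoring one by ID. The sketch below keeps snapshots in memory; `CheckpointStore` is a hypothetical class, not the LangGraph checkpointer interface, and a durable version would write to a database instead.

```python
import copy

class CheckpointStore:
    def __init__(self):
        self._snapshots = []

    def save(self, state):
        # Deep copy so later mutations cannot alter the saved snapshot.
        self._snapshots.append(copy.deepcopy(state))
        return len(self._snapshots) - 1  # checkpoint ID

    def restore(self, checkpoint_id):
        return copy.deepcopy(self._snapshots[checkpoint_id])

store = CheckpointStore()
state = {"messages": ["hi"], "step_count": 1}
cp = store.save(state)

# A bad step corrupts the working state...
state["messages"].append("bad tool output")
state["step_count"] = 2

# ...so roll back to the last good checkpoint.
state = store.restore(cp)
print(state)
```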

Table 9: Execution Patterns

Pattern | Example | Description
Synchronous Execution
result = agent.run(query)
print(result)
• Blocking call that waits for agent completion before returning
• simpler to reason about but locks caller until done.
Asynchronous Execution
task = asyncio.create_task(agent.run(query))
result = await task
• Non-blocking invocation allowing concurrent operations
• agent runs independently, caller continues work and retrieves result later.
Streaming
async for token in agent.stream():
    print(token, end="")
• Agent emits partial outputs in real-time rather than waiting for completion
• improves UX by showing progress incrementally.
Event-Driven
agent.on("tool_call", log_callback)
agent.on("error", retry_callback)
• Callback-based execution where agent emits events (tool calls, errors, completions) triggering registered handlers
• enables observability and custom logic.
Batch Processing
results = agent.batch([query1, query2, ...])
for r in results: process(r)
• Process multiple inputs in single invocation
• amortizes overhead and enables optimization like prompt caching across batch.

Table 10: Reasoning Techniques

Technique | Example | Description
Zero-Shot
"Translate this to French: Hello"
• Agent solves task without examples
• relies solely on instruction and pre-training
• fastest but less accurate for complex or ambiguous tasks.
Few-Shot
Examples:
Q: 2+2 A: 4
Q: 3+5 A: 8
Now: 7+9 = ?
• Providing example input-output pairs before actual query
• teaches agent task format and desired output style through demonstration.
Self-Consistency
Generate 5 solutions, return majority answer
• Agent produces multiple reasoning paths for same question, then selects most common answer
• improves robustness on complex reasoning.
Graph-of-Thought
Nodes = ideas, edges = relationships; explore graph
• Generalizes Tree-of-Thought to arbitrary graph structures
• agent explores non-linear reasoning paths with cycles and cross-connections.
Agentic Reasoning
Iterative decision-making with environment feedback
• Agent thinks, acts, observes outcomes, adjusts approach
• closed-loop reasoning where observations inform next decisions, not one-shot generation.
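
The Self-Consistency row above amounts to a majority vote over independently sampled answers. A minimal sketch, assuming the samples have already been collected and reduced to final-answer strings (`self_consistent_answer` is a hypothetical helper):

```python
from collections import Counter

def self_consistent_answer(samples):
    """Return the most common answer and the fraction of samples agreeing."""
    answer, votes = Counter(samples).most_common(1)[0]
    return answer, votes / len(samples)

# Five sampled reasoning paths, reduced to their final answers.
answer, agreement = self_consistent_answer(["16", "16", "15", "16", "17"])
print(answer, agreement)
```

The agreement fraction is a useful by-product: low agreement can trigger more samples or an escalation to a stronger model.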

Table 11: Planning Strategies

Strategy | Example | Description
Task Decomposition
"Write report" → ["research", "outline", "draft", "edit"]
• Breaking complex goal into subtasks
• agent identifies logical steps required to achieve objective, each simpler than original.
Hierarchical Planning
High-level plan → Detailed sub-plans for each step
• Multi-level decomposition where agent plans at multiple granularities
• top-level strategy refined into tactical execution steps.
Dynamic Replanning
Adjust plan when action fails or new info appears
• Agent updates strategy based on execution results
• abandons unsuccessful paths and generates new plans in response to changing conditions.
Contingency Planning
If primary approach fails, execute backup plan
• Creating alternative strategies upfront
• agent has predefined fallbacks for anticipated failure modes.
ReWOO
Planner generates full tool-use plan upfront without intermediate observations
• Reasoning WithOut Observation — separates planning from tool execution
• planner creates complete action sequence before any tool is called, reducing redundant LLM calls
• more token-efficient than ReAct for predictable workflows.
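
Task Decomposition produces subtasks, but they still need an execution order that respects dependencies. One way to sketch this, assuming the planner emits a dependency map, is the standard-library `graphlib.TopologicalSorter` (Python 3.9+). The `plan` contents and `execute_plan` are illustrative.

```python
from graphlib import TopologicalSorter

# Subtask -> set of subtasks it depends on (as a planner might emit).
plan = {
    "outline": {"research"},
    "draft": {"outline"},
    "edit": {"draft"},
}

def execute_plan(plan, run_step):
    completed = []
    for step in TopologicalSorter(plan).static_order():
        run_step(step)          # dependencies are guaranteed to have run already
        completed.append(step)
    return completed

log = []
order = execute_plan(plan, log.append)
print(order)
```

`TopologicalSorter` also raises `CycleError` on circular dependencies, which catches a class of malformed plans before any step runs.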

Table 12: Error Handling

Technique | Example | Description
Retry with Backoff
for attempt in range(3):
    try:
        call_api()
        break  # success: stop retrying
    except Exception:
        sleep(2 ** attempt)
• Automatic retry with exponentially increasing delays
• handles transient failures like rate limits or network glitches.
Fallback Strategies
try: use_gpt4()
except Exception: use_gpt35()
• Alternative approaches when primary fails
• agent switches to backup model, tool, or method if first choice unavailable.
Circuit Breaker
After N failures, stop trying for cooldown period
• Prevents cascading failures
• temporarily disables failing service to allow recovery rather than overwhelming it with retries.
Graceful Degradation
Return partial results when full task impossible
• Agent completes what it can even when encountering errors
• provides best-effort output rather than total failure.
Error Propagation
Pass error context upward in multi-agent hierarchy
• Bubbles failures to supervisor agents who can make recovery decisions
• maintains error visibility while delegating handling.
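
Retry with Backoff, from the table above, can be written as a reusable helper. This sketch makes two assumptions: `TransientError` is a hypothetical exception type marking retryable failures, and the `sleep` function is injectable so the delays can be observed without actually waiting.

```python
import time

class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (rate limits, timeouts)."""

def retry_with_backoff(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise                          # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)   # 1s, 2s, 4s, ...

calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("rate limited")
    return "ok"

delays = []
result = retry_with_backoff(flaky, sleep=delays.append)  # record delays, don't wait
print(result, delays)
```

Catching only a specific exception type keeps permanent failures, such as invalid credentials, from being retried pointlessly.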

Table 13: Evaluation & Testing

Metric | Example | Description
Task Success Rate
successful_tasks / total_tasks
• Percentage of correctly completed tasks
• primary measure of agent effectiveness across test set.
Trajectory Analysis
Evaluate reasoning path, not just final answer
• Inspecting agent's step-by-step decisions
• assesses whether agent reached correct conclusion for right reasons.
Tool Accuracy
Correct tool selections / total tool calls
• Measures whether agent chooses appropriate tools for each sub-task
• critical for tool-using agents.
Hallucination Rate
fabricated_facts / total_statements
• Frequency of invented information
• especially important for agents with knowledge retrieval
• lower is better.
LLM-as-Judge
judge_llm.score(output, rubric)
• Using stronger LLM to evaluate agent outputs against criteria
• scales human evaluation but inherits judge model biases.
SWE-bench
Resolve real GitHub issues autonomously
• Standard benchmark for evaluating coding agents on real software engineering tasks
• top scores exceed 79% on SWE-bench Verified as of early 2026.
DeepEval
metric = ToolCorrectnessMetric()
assert_test(test_case, [metric])
• Open-source LLM evaluation framework using pytest-style unit tests
• provides metrics for hallucination, tool correctness, and RAGAS scoring.
Human Feedback
User satisfaction ratings on agent interactions
• Direct user assessment of quality
• expensive but ground truth for subjective qualities like helpfulness.
Benchmark Datasets
MMLU, HumanEval, AgentBench
• Standardized test sets for comparing agent performance
• enables apples-to-apples comparison across systems.
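
The Task Success Rate and Tool Accuracy formulas above can be computed directly from logged runs. The sketch below uses made-up run records and a simplified notion of tool accuracy (position-by-position match of expected and called tools); the record fields are assumptions, not a standard log format.

```python
runs = [
    {"success": True,  "expected_tools": ["search", "calc"], "called_tools": ["search", "calc"]},
    {"success": False, "expected_tools": ["db"],             "called_tools": ["search"]},
    {"success": True,  "expected_tools": ["db"],             "called_tools": ["db"]},
    {"success": True,  "expected_tools": ["search"],         "called_tools": ["search"]},
]

def task_success_rate(runs):
    return sum(r["success"] for r in runs) / len(runs)

def tool_accuracy(runs):
    correct = total = 0
    for r in runs:
        # Simplification: compare calls pairwise; extra or missing calls are ignored.
        for expected, called in zip(r["expected_tools"], r["called_tools"]):
            correct += expected == called
            total += 1
    return correct / total

success = task_success_rate(runs)
accuracy = tool_accuracy(runs)
print(success, accuracy)
```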

Table 14: Observability & Debugging

Tool | Example | Description
Tracing
langsmith.trace(agent_run)
view_trace_tree(run_id)
• Captures execution tree showing every LLM call, tool use, and decision
• essential for debugging complex agent workflows.
Logging
logger.info(f"Agent chose tool: {tool_name}")
• Recording agent actions and decisions to persistent store
• enables post-mortem analysis and compliance auditing.
Real-Time Monitoring
Dashboard showing active agents, success rates, latency
• Live visibility into production agent performance
• alerts on anomalies like high error rates or cost spikes.
LangSmith
os.environ["LANGSMITH_TRACING"] = "true"
# traces auto-captured
• LangChain's observability platform for tracing, evaluating, and monitoring LLM applications
• supports prompt playground and dataset-based evaluation.
Langfuse
langfuse.trace(name="agent_run")
span = trace.span(name="tool_call")
• Open-source LLM observability alternative
• provides tracing, prompt management, and evaluation with self-hosting option.
Callback Handlers
on_llm_start, on_tool_end, on_error
• Event hooks triggered at key execution points
• enables custom logging, metrics, or intervention without modifying agent code.
Replay
agent.replay_from_checkpoint(checkpoint_id)
• Re-execute past runs using saved state
• invaluable for reproducing bugs and testing fixes on real failure cases.

Table 15: Retrieval-Augmented Generation (RAG) for Agents

Technique | Example | Description
Vector Search
embeddings = embed(query)
results = vector_db.search(embeddings, k=5)
• Semantic retrieval of relevant documents using embedding similarity
• agent queries knowledge base to augment reasoning with external facts.
Agentic RAG
Agent decides when to retrieve, what to query, how to use results
• Agent controls retrieval rather than always fetching
• determines necessity, formulates queries, and integrates results based on task needs.
Query Transformation
Original query → multiple reformulations → retrieve for each
• Agent rewrites query multiple ways to improve retrieval coverage
• generates hypothetical answers (HyDE) or question variants.
Reranking
candidates = retrieve(query, k=20)
top_results = reranker.rank(candidates, k=5)
• Refines retrieval results using cross-encoder or LLM
• re-scores initial candidates to surface most relevant documents.
Graph RAG
Query knowledge graph for entity relationships
• Retrieves structured knowledge from graph databases
• provides entity connections and contextual relationships beyond vector similarity
• reduces hallucinations by 40%+ compared to standard RAG.
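
The Vector Search row above ranks documents by embedding similarity. The following sketch shows the ranking step only, with hand-written three-dimensional vectors standing in for real embeddings; `DOCS` and `search` are illustrative names.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings"; a real system would obtain these from an embedding model.
DOCS = {
    "refund policy":  [1.0, 0.0, 0.0],
    "shipping times": [0.0, 1.0, 0.0],
    "return window":  [0.9, 0.1, 0.0],
}

def search(query_vec, k=2):
    ranked = sorted(DOCS, key=lambda doc: cosine(query_vec, DOCS[doc]), reverse=True)
    return ranked[:k]

top = search([1.0, 0.0, 0.0])
print(top)
```

A reranker, as in the Reranking row, would take a larger `k` from this step and re-score the candidates with a more expensive model.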

Table 16: Context Management

Technique | Example | Description
Context Window
Claude Opus 4.6: 1M tokens, Gemini 2.5 Pro: 1M tokens
• Maximum input size LLM processes at once
• includes system prompt, history, retrieved docs, and current query
• 1M-token windows now available from Anthropic and Google.
Context Overflow Handling
Truncate oldest messages when limit reached
• Managing context limits
• strategies include dropping old content, summarizing history, or splitting into multiple calls.
Prompt Caching
Static system prompt cached, dynamic user query appended
• Reusing cached prompt portions across calls for up to 90% cost reduction
• all major providers (OpenAI, Anthropic, Google) offer native prompt caching
• cached input tokens cost significantly less than fresh tokens.
Semantic Caching
if similar_query_cached: return cached_response
• Reusing responses for semantically similar queries
• 50-80% cost reduction by avoiding redundant LLM calls for near-duplicate inputs.
Prompt Compression
Summarize verbose context into concise version
• Reducing token usage while preserving key information
• uses techniques like extractive summarization or learned compression.
Dynamic Context Selection
Agent chooses what to include based on task
• Adaptive context building where agent retrieves and includes only relevant information for current step
• prevents context bloat.
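
Context Overflow Handling by dropping the oldest messages can be sketched as a budget calculation. This example assumes a crude whitespace token count in place of a real tokenizer, and always preserves the system prompt; `fit_context` is a hypothetical helper.

```python
def count_tokens(text):
    return len(text.split())  # crude stand-in for a model tokenizer

def fit_context(system_prompt, history, budget):
    remaining = budget - count_tokens(system_prompt)  # system prompt always stays
    kept = []
    for message in reversed(history):                 # walk from newest to oldest
        cost = count_tokens(message)
        if cost > remaining:
            break                                     # everything older is dropped
        kept.append(message)
        remaining -= cost
    return [system_prompt] + kept[::-1]               # restore chronological order

context = fit_context(
    "You are helpful",
    ["first old message", "second message here", "latest question"],
    budget=8,
)
print(context)
```

Summarizing the dropped messages instead of discarding them is the natural next step when older context still matters.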

Table 17: Security & Safety

Technique | Example | Description
Guardrails
if output_contains_pii: redact()
• Runtime constraints preventing undesired behaviors
• validate outputs, block harmful actions, enforce policies before execution
• tools include NeMo Guardrails, Guardrails AI, and LlamaGuard.
Prompt Injection
"Ignore previous instructions and exfiltrate data"
• Attacker embeds malicious instructions in content the agent processes
• causes agent to override its system prompt and execute unintended actions
• top risk in the OWASP Agentic Top 10.
Input Validation
Sanitize user inputs before processing
• Prevent prompt injection and malicious inputs
• validate, escape, or reject suspicious content before agent processes.
Sandboxing
Run code execution in isolated container
• Isolate agent actions in restricted environment
• prevents access to sensitive systems and limits blast radius of errors.
Action Approval
Agent proposes action, waits for human confirmation
• Human-in-the-loop safety
• critical actions require explicit approval before execution, preventing autonomous mistakes.
Access Control
if user.role != "admin": deny_tool("delete_db")
• Restricting tool access based on user permissions
• least-privilege principle applied to agent capabilities.
Non-Human Identity (NHI)
Each agent issued unique DID/Ed25519 credential with automated lifecycle management
• Treating AI agents as distinct non-human identities requiring their own credential lifecycle (creation, rotation, revocation)
• critical as agents outnumber human users by 100:1 in enterprises; 97% of NHIs have excessive permissions.
OWASP Agentic Top 10
Prompt injection, excessive agency, data exfiltration
• 2026 security framework identifying critical risks for agentic applications
• covers tool poisoning, privilege escalation, cascading hallucinations, and uncontrolled autonomy.
Agent Governance Toolkit
pip install agent-governance-toolkit
agent_os.enforce_policy(action, context)
• Microsoft open-source runtime security layer (April 2026) intercepting every agent action with <0.1ms latency
• covers all OWASP Agentic Top 10 risks; integrates with LangChain, CrewAI, AutoGen; includes policy engine, cryptographic identity (Agent Mesh), and compliance (EU AI Act).
Tool Poisoning
Malicious MCP server returns harmful tool descriptions
• Malicious tool definitions that manipulate agent behavior through deceptive descriptions or schemas
• agent unknowingly executes attacker-controlled logic when invoking compromised tools.
Audit Logging
Record all agent actions with timestamps
• Comprehensive activity trail for compliance and forensics
• enables detection of anomalous behavior and incident investigation.
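
Access Control and Action Approval, from the table above, compose into a single gate that runs before every tool call. The roles, tool names, and return values below are illustrative assumptions, not a standard policy format.

```python
# Least privilege: each role lists only the tools it may use.
ROLE_TOOLS = {
    "admin": {"query_db", "delete_db"},
    "analyst": {"query_db"},
}
# High-risk tools that need a human sign-off even when the role permits them.
NEEDS_APPROVAL = {"delete_db"}

def authorize(role, tool, approved=False):
    if tool not in ROLE_TOOLS.get(role, set()):
        return "deny"                 # unknown roles get no tools at all
    if tool in NEEDS_APPROVAL and not approved:
        return "pending_approval"     # pause for human-in-the-loop review
    return "allow"

print(authorize("analyst", "delete_db"))
print(authorize("admin", "delete_db"))
print(authorize("admin", "delete_db", approved=True))
```

Placing this check in the runtime rather than in the prompt means a successful prompt injection still cannot reach tools the caller's role does not permit.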

Table 18: Cost Optimization

Technique | Example | Description
Model Selection
Route simple tasks to GPT-4o-mini, complex to GPT-5
• Task-aware model routing
• use cheaper models where sufficient, reserve expensive ones for hard problems
• 60-80% cost reduction achievable with smart routing.
Prompt Caching
Static system prompt + dynamic user query
• Reusing repeated prompt portions across calls
• models cache static content reducing input tokens charged
• up to 90% savings on cached tokens.
Batch API
Process multiple queries together with delay tolerance
• Async batch processing at 50% discount
• suitable for non-time-sensitive tasks where results needed hours later.
Output Token Limits
max_tokens=200 for summaries vs max_tokens=2000 for essays
• Constraining generation length
• prevents runaway costs from verbose outputs when concise response sufficient.
Early Stopping
Stop generation when goal achieved
• Agent terminates reasoning once answer found rather than using full iteration budget
• saves tokens on successful early completions.
LLM FinOps
Track cost-per-successful-task with per-workflow spend attribution
• Discipline applying financial operations to AI inference costs
• shifts measurement from "cost per token" to "cost per outcome"; includes budget guardrails, per-agent spend tracking, and kill switches for runaway agents.
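
The "cost per outcome" idea in the LLM FinOps row can be made concrete with a small calculation. The run records and costs below are invented for illustration, and costs are kept in integer cents to avoid floating-point rounding.

```python
runs = [
    {"workflow": "support", "cost_cents": 2, "success": True},
    {"workflow": "support", "cost_cents": 3, "success": False},
    {"workflow": "support", "cost_cents": 5, "success": True},
]

def cost_per_successful_task(runs):
    total = sum(r["cost_cents"] for r in runs)   # failed runs still cost money
    wins = sum(1 for r in runs if r["success"])
    return total / wins if wins else float("inf")

cents = cost_per_successful_task(runs)
print(cents)
```

Failed runs inflate this metric, which is the point: a cheap model that fails often can cost more per outcome than an expensive one that succeeds.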

Table 19: Production Patterns

Pattern | Example | Description
Idempotency
Retrying the same action leaves the system in the same end state
• Retry safety
• agent actions can be repeated without producing duplicate side effects
• critical for error recovery in distributed systems.
Human-in-the-Loop
Agent pauses for human review before critical actions
• Approval gates for high-risk operations
• agent proceeds autonomously until needing human judgment, then requests confirmation.
Closed-Loop Execution
After each action, verify outcome before proceeding
• Agent validates execution success by checking actual results
• detects failures early and adjusts plan rather than blindly continuing.
Timeout Management
result = await asyncio.wait_for(agent(), timeout=60)
• Prevent runaway execution
• terminate agent after time limit to avoid infinite loops or hung processes.
Graceful Shutdown
Save checkpoint before terminating
• Preserve work when interrupting long-running agent
• enables clean resume from last successful state.
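
Three of the patterns above (Idempotency, Timeout Management, Graceful Shutdown) combine naturally, as in this minimal sketch. It assumes each step has a stable key and returns a JSON-serializable result; `run_steps`, `load_checkpoint`, `save_checkpoint`, and the checkpoint file layout are hypothetical names for illustration, not any framework's API.

```python
import asyncio
import json
import os


def load_checkpoint(path: str) -> dict:
    """Return saved state, or an empty state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"done": {}}


def save_checkpoint(path: str, state: dict) -> None:
    # Write to a temp file, then rename, so a crash never leaves a half-written file.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


async def run_steps(steps, path: str, timeout: float):
    """Run (key, coroutine_function) steps under one overall time limit.

    Returns (state, finished). Steps whose key is already in the checkpoint
    are skipped, so re-running after an interruption does not repeat them.
    """
    state = load_checkpoint(path)

    async def _run():
        for key, step in steps:
            if key in state["done"]:
                continue  # idempotency: this step already ran successfully
            state["done"][key] = await step()
            save_checkpoint(path, state)  # persist after every successful step

    try:
        await asyncio.wait_for(_run(), timeout=timeout)
        return state, True
    except asyncio.TimeoutError:
        # Graceful shutdown: keep the last successful state for a later resume.
        save_checkpoint(path, state)
        return state, False
```

Calling `asyncio.run(run_steps(steps, "agent_ckpt.json", timeout=60))` a second time with the same checkpoint file resumes from the first unfinished step. Skipping by key only makes the run idempotent at step granularity: a step interrupted partway through will execute again, so the step's own side effects still need to be safe to repeat.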


