© 2026 CheatGrid™. All rights reserved.

LangSmith Cheat Sheet

Updated 2026-05-28

LangSmith is a unified DevOps platform from LangChain for developing, debugging, testing, deploying, and monitoring LLM applications and AI agents. It provides framework-agnostic observability with tracing, evaluation datasets, online and offline evaluations, prompt management, human-in-the-loop workflows, and production monitoring, helping teams move from prototype to production. In 2026, LangSmith expanded significantly with LangSmith Engine (autonomous failure clustering plus PR proposals), SmithDB (a purpose-built Rust/DataFusion database, up to 15x faster), Context Hub (versioned agent context management), LLM Gateway (runtime spend limits plus PII redaction), and Sandboxes GA (hardware-virtualized microVMs for safe agent code execution). Its core differentiator remains end-to-end visibility into agent execution via traces, which lets developers understand, evaluate, and continuously improve complex multi-step LLM and agent workflows.

What This Cheat Sheet Covers

This topic spans 22 focused tables and 163 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

• Table 1: Core Concepts
• Table 2: Tracing and Observability
• Table 3: Datasets and Examples
• Table 4: Offline Evaluations
• Table 5: Online Evaluations and Monitoring
• Table 6: LangSmith Engine
• Table 7: Human Annotation Workflows
• Table 8: Prompt Engineering
• Table 9: Context Hub
• Table 10: Python SDK
• Table 11: JavaScript/TypeScript SDK
• Table 12: OpenTelemetry (OTEL) Integration
• Table 13: Framework Integrations
• Table 14: REST API
• Table 15: Production Deployment
• Table 16: LangSmith Fleet (Agent Platform)
• Table 17: Self-Hosted LangSmith
• Table 18: Pricing and Plans
• Table 19: Comparison with Alternatives
• Table 20: Best Practices
• Table 21: Common Use Cases
• Table 22: Advanced Features

Table 1: Core Concepts

This table covers the vocabulary of LangSmith; understanding these building blocks is a prerequisite for everything else in the platform. The trace → run → feedback hierarchy maps directly to how observability data flows into evaluations, annotation queues, and Engine-driven improvements.
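The trace → run → feedback hierarchy can be pictured with a small plain-Python model. This is only an illustrative sketch, not the LangSmith SDK: the class names (`Run`, `Feedback`, `Trace`), their fields, and the sample values are all hypothetical, chosen to show how nested runs roll up into one trace and how feedback attaches to a run.

```python
import uuid
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Run:
    """One step in a trace (hypothetical model, not the SDK's run object)."""
    name: str
    run_type: str              # e.g. "llm", "chain", "tool", "retriever"
    total_tokens: int = 0
    latency_ms: float = 0.0
    children: list["Run"] = field(default_factory=list)
    id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def add_child(self, child: "Run") -> "Run":
        """Nest a child run under this run and return the child."""
        self.children.append(child)
        return child

    def walk(self) -> Iterator["Run"]:
        """Yield this run, then every descendant, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()


@dataclass
class Feedback:
    """A score attached to a single run, e.g. from an evaluator or reviewer."""
    run_id: str
    key: str
    score: float


@dataclass
class Trace:
    """Top-level unit: one root run plus the feedback recorded against its runs."""
    root: Run
    feedback: list[Feedback] = field(default_factory=list)

    def total_tokens(self) -> int:
        """Sum token counts over every run in the trace."""
        return sum(run.total_tokens for run in self.root.walk())

    def scores(self, key: str) -> list[float]:
        """Return all feedback scores recorded under the given key."""
        return [fb.score for fb in self.feedback if fb.key == key]


# One request: a chain that retrieves documents, then calls an LLM.
root = Run("rag_pipeline", "chain", latency_ms=950.0)
root.add_child(Run("retrieve_docs", "retriever", latency_ms=120.0))
root.add_child(Run("generate_answer", "llm", total_tokens=420, latency_ms=800.0))

trace = Trace(root)
trace.feedback.append(Feedback(root.id, "helpfulness", 1.0))
```

Here the root run's latency is recorded for the whole request rather than summed from its children, since a parent's wall-clock time already includes the steps nested under it.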

Concept: Trace
Example: Single request → LLM → retrieval → response
Description: Top-level execution unit
• End-to-end execution path capturing the full request lifecycle
• Contains all runs (spans) for a single request
• Analogous to a trace in distributed tracing (OpenTelemetry); its runs correspond to spans
• Includes input, output, metadata, timestamps, cost

Concept: Run (Span)
Example: Individual LLM call, retriever step, or tool invocation within a trace
Description:
• Individual step within a trace, similar to an OpenTelemetry span
• Tracks a single operation (LLM, chain, tool, retriever)
• Includes token counts, latency, cost; nested for complex workflows

Concept: Dataset
Example: {"input": "What is AI?", "expected": "..."}
Description: Versioned test-case collection
• Curated test cases for evaluation
• Supports CSV, JSON, JSONL, plus file attachments (images, PDFs, audio, video)
• Versioned: a new version is created on every change; experiments can be pinned to a specific version

Concept: Experiment
Example: Run application on dataset → compare v1 vs v2 prompts
Description:
• Evaluation run on a dataset, producing scores and metrics (accuracy, latency, cost)
• Supports a comparison view for A/B testing and baseline pinning for regression detection

Concept: Evaluator
Example: LLM-as-judge, code-based, or human reviewer
Description: Scores outputs on criteria
• Scoring function for evaluation
• Types: LLM-as-judge, code-based, human (annotation queues), composite (weighted multi-score)
• Applied to experiments or online runs; now reusable across projects
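To make the dataset, experiment, and evaluator concepts concrete, here is a minimal plain-Python sketch of an offline comparison between two application versions. It deliberately does not use the LangSmith SDK: the dataset contents, the two code-based evaluators, the weights, and the stand-in `app_v1` / `app_v2` functions are all hypothetical, and a real setup would run experiments through the platform instead.

```python
from statistics import mean
from typing import Callable

# Hypothetical dataset: each example pairs an input with an expected output.
DATASET = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

# An evaluator maps (application output, dataset example) to a score in [0, 1].
Evaluator = Callable[[str, dict], float]


def exact_match(output: str, example: dict) -> float:
    """Code-based evaluator: 1.0 if the output equals the expected answer."""
    same = output.strip().lower() == example["expected"].strip().lower()
    return 1.0 if same else 0.0


def concise(output: str, example: dict) -> float:
    """Code-based evaluator: 1.0 if the output is at most 20 characters."""
    return 1.0 if len(output) <= 20 else 0.0


def composite(weighted: list[tuple[Evaluator, float]]) -> Evaluator:
    """Combine several evaluators into one weighted score."""
    total_weight = sum(weight for _, weight in weighted)

    def score(output: str, example: dict) -> float:
        weighted_sum = sum(w * ev(output, example) for ev, w in weighted)
        return weighted_sum / total_weight

    return score


def run_experiment(app: Callable[[str], str], dataset: list[dict],
                   evaluator: Evaluator) -> float:
    """Run the app on every example and return the mean evaluator score."""
    return mean(evaluator(app(ex["input"]), ex) for ex in dataset)


# Two stand-in application versions, e.g. the same app with different prompts.
def app_v1(question: str) -> str:
    answers = {"2 + 2": "4",
               "capital of France": "The capital of France is Paris."}
    return answers[question]


def app_v2(question: str) -> str:
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers[question]


quality = composite([(exact_match, 0.75), (concise, 0.25)])
baseline = run_experiment(app_v1, DATASET, quality)    # pinned baseline
candidate = run_experiment(app_v2, DATASET, quality)   # new version
print(f"baseline={baseline:.2f} candidate={candidate:.2f}")
```

Treating `baseline` as the pinned reference, a regression check reduces to asserting that `candidate` does not fall below it on the same dataset version.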
