Structured output generation transforms LLM free-text responses into reliably typed, machine-readable formats (JSON, XML, Pydantic models) by constraining the model at the prompt, API, or decoding level. Modern providers now offer native schema enforcement that guarantees valid JSON matching a given JSON Schema, eliminating fragile regex post-processing. This cheat sheet covers every major approach, from cloud provider APIs and Python libraries to constrained decoding engines and agentic pipeline patterns, giving you the right tool for each situation.
What This Cheat Sheet Covers
This cheat sheet spans 14 focused tables and 84 indexed concepts. Below is a table-by-table outline, running from foundational concepts through advanced details.
Table 1: Core Concepts and Approaches
Overview of the four fundamental strategies for obtaining structured output from LLMs, ordered from simplest to most reliable.
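As a rough illustration of how the prompt-level and API-level strategies differ in practice, the sketch below builds the request settings for each one and adds a best-effort fallback parser for replies that are not schema-enforced. It makes no network calls. The `PERSON_SCHEMA` schema and the `parse_with_fallback` helper are hypothetical names invented for this example, and the `name` and `strict` keys inside `json_schema` assume OpenAI's request shape; other providers may expect something different.

```python
import json
from typing import Optional

# Hypothetical schema, used only for illustration.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}

# Prompt-only: nothing is enforced, so the reply must be checked afterwards.
PROMPT_ONLY = 'Respond ONLY with valid JSON: {"name": ..., "age": ...}'

# JSON mode: valid JSON syntax, but no guarantee about fields or types.
JSON_MODE = {"type": "json_object"}

# Schema enforcement: the wrapper keys below assume OpenAI's request shape.
JSON_SCHEMA_MODE = {
    "type": "json_schema",
    "json_schema": {"name": "person", "strict": True, "schema": PERSON_SCHEMA},
}

# Maps the two JSON Schema primitive types used above to Python types.
_PY_TYPES = {"string": str, "integer": int}


def parse_with_fallback(raw: str, schema: dict) -> Optional[dict]:
    """Best-effort parse for unenforced replies; returns None on failure."""
    # Models often wrap JSON in prose, so keep only the outermost {...} span.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Minimal check: each required key exists with the declared primitive type.
    for key in schema["required"]:
        expected = _PY_TYPES[schema["properties"][key]["type"]]
        value = data.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return None
    return data
```

A caller would typically retry or re-prompt when `parse_with_fallback` returns `None`. With schema enforcement that branch should rarely trigger, which is the main practical difference between the strategies. The hand-rolled type check only covers the two primitive types in this toy schema; a real pipeline would more likely use a full JSON Schema validator or a Pydantic model.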
| Concept | Example | Description |
|---|---|---|
| Prompt-based instruction | `"Respond ONLY with valid JSON: {\"name\": ..., \"age\": ...}"` | Instruct the model to produce structured output via the system or user prompt. Zero infrastructure cost, but a ~5–15% failure rate on complex schemas; requires a post-processing fallback. |
| JSON mode | `response_format={"type": "json_object"}` | API flag that guarantees valid JSON syntax but NOT schema conformance; fields may be missing or mistyped. First introduced by OpenAI in November 2023. |
| JSON Schema enforcement | `response_format={"type": "json_schema", "json_schema": {...}}` | API-level guarantee that output matches a specific JSON Schema. Enabled by constrained decoding on the server; ~100% syntactic conformance. Phase 3 era (Aug 2024+). |