The Anthropic API provides programmatic access to Claude, Anthropic's family of advanced large language models, enabling developers to integrate state-of-the-art AI capabilities into applications. Available through direct API access, AWS Bedrock, Google Vertex AI, and Microsoft Foundry, the API supports text generation, vision, function calling, streaming responses, server-side tools, and agentic workflows. As of April 2026, Claude Opus 4.7 is the flagship model with a 1M token context window, adaptive thinking, and step-change gains in agentic coding; the 1M context window is now standard pricing with no surcharge across all Claude 4.x models. The API follows a stateless message-based architecture, and new offerings like Claude Managed Agents (public beta April 2026) provide full hosted infrastructure for production agent deployments at $0.08 per session-hour.
What This Cheat Sheet Covers
This topic spans 14 focused tables and 112 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Claude Models
Picking the right model is the first decision in any Claude integration — it sets your cost ceiling, speed, and capability all at once. This lineup runs from the flagship Opus tier through the balanced Sonnet workhorse to fast, cheap Haiku, with the deprecating previous-generation models flagged so you know what to migrate off. Model ID string, context window, output cap, and per-million-token price all sit side by side.
| Model | Example | Description |
|---|---|---|
model: "claude-opus-4-7"1M context, 128K output | • Current flagship GA model (released April 16, 2026) • step-change in agentic coding, vision, and knowledge work • adaptive thinking only (no manual budget_tokens) • temperature/top_p/top_k removed (400 error if set) • new tokenizer uses up to 35% more tokens than prior models • 5/25 per million input/output tokens. | |
model: "claude-sonnet-4-6"1M context, 64K output | • Best speed/intelligence balance; 1M context GA as of March 13, 2026 • 30–50% faster than Sonnet 4.5 • supports adaptive thinking and extended thinking • 3/15 per million tokens• default for most production use cases. | |
model: "claude-haiku-4-5"200K context, 64K output | • Fastest and most cost-effective at 1/5 per million tokens• supports extended thinking • no prompt injection protection — review before agentic use • ideal for high-volume, low-latency tasks. | |
model: "claude-opus-4-6"1M context, 128K output | • Previous flagship; still available at same price ( 5/25)• supports adaptive + manual extended thinking • fast mode beta available (6× price) • superseded by Opus 4.7 for most use cases. |