The OpenAI API provides programmatic access to state-of-the-art language models (GPT-5.5, GPT-5.4, GPT-5.2, o3, o4-mini), image generation (gpt-image-2, gpt-image-1.5), speech-to-text (gpt-4o-transcribe), text-to-speech, embeddings, and video generation. The 2025–2026 landscape is dominated by the Responses API as the primary interface for agentic workflows, with built-in tools for web search, code interpreter, file search, computer use, hosted shell, skills, and Model Context Protocol (MCP) integrations. Key 2026 additions include GPT-5.5 (flagship, April 2026), the GPT-5.4 family (March 2026), gpt-image-2 (April 2026), Realtime 2 with speech translation (May 2026), and Compaction for long-running agent contexts. ⚠️ Fine-tuning is being wound down (new users blocked May 2026; existing users until Jan 2027), and the Assistants API sunsets August 26, 2026 — migrate to the Responses API and Conversations API before those dates.
What This Cheat Sheet Covers
This topic spans 18 focused tables and 183 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Core API Endpoints
The Responses API is OpenAI's primary interface for building agents and complex workflows; Chat Completions remains recommended for simple stateless apps. Knowing which endpoint to use — and the capabilities each one unlocks — is the first decision in any integration.
| Endpoint | Example | Description |
|---|---|---|
client.responses.create(model="gpt-5.5", input=[...]) | • Primary interface for stateful agentic conversations • built-in tools: web_search, file_search, code_interpreter, computer_use, hosted_shell, skills, apply_patch • supports MCP, Compaction, Conversations API, WebSocket mode • supersedes Assistants API (sunset Aug 26, 2026). | |
client.chat.completions.create(model="gpt-5.5", messages=[...]) | • Stateless endpoint for conversational AI • pass message history explicitly each request • supports JSON mode, vision, function calling, streaming • still recommended for simple stateless apps; reasoning models get better performance via Responses API. | |
client.conversations.create(model="gpt-5.5"); client.conversations.turns.create(...) | • Manages long-running stateful conversations within the Responses API • OpenAI handles thread state server-side • replaces Assistants Threads; provides side-by-side migration from Assistants API • released August 2025. | |
client.embeddings.create(model="text-embedding-3-small", input="text") | • Converts text to dense vector representations for semantic search, clustering, RAG • returns 1536-dim (small) or 3072-dim (large) vectors • supports dimensions param for vector compression. | |
client.audio.transcriptions.create(model="gpt-4o-transcribe", file=audio) | • Transcribes audio to text • gpt-4o-transcribe (higher accuracy) or gpt-4o-mini-transcribe (fast, low cost) • supports 98 languages, background noise, diverse accents • translations endpoint converts non-English audio to English. | |
client.audio.speech.create(model="gpt-4o-mini-tts", voice="alloy", input="text") | • Generates spoken audio from text • gpt-4o-mini-tts (steerable style/emotion) or tts-1/tts-1-hd (legacy) • 6 voices: alloy, echo, fable, onyx, nova, shimmer • formats: MP3, Opus, AAC, FLAC. |