The OpenAI API provides programmatic access to large language models (LLMs), including GPT-5.5 and GPT-5.4, and to specialized models for text generation, embeddings, image creation, speech, moderation, and video generation (via the Sora API). It is a RESTful API with official SDKs for Python, Node.js, .NET, and other languages, so developers can integrate AI capabilities through plain HTTP requests or through the Agents SDK for multi-agent workflows. Unlike the ChatGPT web interface, the API offers token-based pricing, fine-grained control over model parameters, and access to the Responses API, the recommended interface for all new applications since March 2025, with built-in tools for web search, computer use, hosted shell, and MCP. Three things matter most for effective API usage in 2026: reasoning effort (the reasoning.effort parameter, which controls the intelligence/cost tradeoff), extended prompt caching (up to 24 hours for eligible models, reducing costs by up to 90%), and the Assistants API sunset scheduled for August 26, 2026.
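As a rough illustration of the reasoning-effort tradeoff described above, the sketch below builds, but does not send, a Responses API request body using only the standard library. The helper name `build_responses_request` and the prompt are hypothetical, the model name is simply the one mentioned in the text, and the accepted effort values are an assumption to check against the current API reference for your model.

```python
import json
import os
import urllib.request

RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_responses_request(prompt, model="gpt-5.5", effort="low"):
    """Return an unsent urllib Request for the Responses API (hypothetical helper)."""
    body = {
        "model": model,                   # model name taken from the text above
        "input": prompt,                  # plain-string input
        "reasoning": {"effort": effort},  # assumed values such as "low"/"medium"/"high"
    }
    # The key is read from the environment; an empty string is used if it is unset.
    api_key = os.environ.get("OPENAI_API_KEY", "")
    return urllib.request.Request(
        RESPONSES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_responses_request("Summarize HTTP caching in one sentence.")
payload = json.loads(request.data)  # decode the body again to inspect what would be sent
```

Passing the request to `urllib.request.urlopen` would send it; the official SDKs wrap the same HTTP call, so the body shape is the part worth remembering.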
What This Cheat Sheet Covers
This cheat sheet spans 22 focused tables and 194 indexed concepts. Below is a complete table-by-table outline, from foundational concepts through advanced details.
Table 1: API Endpoints
Every OpenAI capability lives at its own REST or WebSocket endpoint, and picking the right one shapes cost, latency, and how much state you manage. Start new text apps on the Responses API; reach for the specialized endpoints (embeddings, audio, images, video, moderation, batch) when the task matches their shape.
| Endpoint | Description |
|---|---|
| POST https://api.openai.com/v1/responses | • Responses API: recommended interface for all new applications • built-in tools: web_search, file_search, code_interpreter, computer_use, shell, mcp, image_generation, tool_search, skills • stateful: carry conversation context with previous_response_id instead of resending history • best performance for reasoning models. |
| POST https://api.openai.com/v1/chat/completions | • Chat Completions API: industry-standard endpoint for conversational responses • stateless: you resend the full messages array each call • supports streaming, function calling, vision, structured outputs • supported indefinitely, but Responses offers better reasoning-model performance. |
| POST https://api.openai.com/v1/embeddings | • Embeddings API: converts text into a numeric vector (1536 dims for text-embedding-3-small, 3072 for text-embedding-3-large) • returns the vector only; you compute similarity (e.g. cosine) and store vectors yourself • powers semantic search, clustering, recommendations, and RAG. |
| POST https://api.openai.com/v1/images/generations | • Image Generation API: creates a new image from a text prompt (GPT Image models) • token-based pricing • use the separate POST /v1/images/edits to modify an existing image with a prompt or mask. |
| POST https://api.openai.com/v1/videos | • Sora Video API: asynchronous text-to-video generation • the call returns a job id with a queued/in_progress status, not the file • poll GET /v1/videos/{id} or register a webhook, then download the MP4 from GET /v1/videos/{id}/content once completed. |
| POST https://api.openai.com/v1/audio/transcriptions | • Speech-to-Text API: converts spoken audio into written text • returns a full transcript (optionally with timestamps or word-level timing), not a summary • supports many languages. |
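
The Embeddings row notes that the API returns only the vector and leaves similarity to you. A minimal sketch of that step, assuming the vectors have already been fetched and are held as plain Python lists: the function names, document ids, and the toy 3-dimensional vectors are all made up for illustration (real embeddings have 1536 or 3072 dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for stored embeddings (hypothetical data).
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.7, 0.3, 0.1],
    "tax": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]
ranking = rank_by_similarity(query, docs)
```

For more than a few thousand vectors, a vector database or a NumPy matrix product would usually replace this loop; the pure-Python version is only meant to show the arithmetic.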