The OpenAI API provides programmatic access to state-of-the-art language models (GPT-5.2, GPT-5, GPT-4.1, GPT-4o, o-series), image generation (gpt-image-1), speech-to-text (Whisper, gpt-4o-transcribe), text-to-speech, embeddings, and fine-tuning capabilities. The 2025 landscape introduced the Responses API as the primary interface for agentic workflows, along with built-in tools for web search, code interpreter, file search, computer use, and Model Context Protocol (MCP) integrations. OpenAI also launched GPT-4.1 (1M context, April 2025), GPT-5 (Aug 2025), gpt-image-1 replacing DALL-E 3 (deprecated May 2026), and Reinforcement Fine-Tuning (RFT) for o4-mini. Understanding prompt engineering, structured outputs, function calling, and the token-based pricing model is essential for building reliable AI-powered applications.
What This Cheat Sheet Covers
This topic spans 18 focused tables and 160 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Core API Endpoints
| Endpoint | Example | Description |
|---|---|---|
client.responses.create(model="gpt-5.2", input=[...]) | β’ Primary interface for stateful agentic conversations β’ built-in tools: web_search, file_search, code_interpreter, computer_use β’ supports MCP (Model Context Protocol) for external service integrations β’ supersedes Assistants API (sunset Aug 26, 2026). | |
client.chat.completions.create(model="gpt-4o", messages=[...]) | β’ Stateless endpoint for conversational AI β’ pass message history explicitly each request β’ supports JSON mode, vision, function calling, streaming β’ widely compatible; still recommended for simple stateless apps. | |
client.embeddings.create(model="text-embedding-3-small", input="text") | β’ Converts text to dense vector representations for semantic search, clustering, RAG β’ returns 1536-dim (small) or 3072-dim (large) vectors β’ supports dimensions param for vector compression. | |
client.audio.transcriptions.create(model="gpt-4o-transcribe", file=audio) | β’ Transcribes audio to text β’ gpt-4o-transcribe (new, higher accuracy) or whisper-1 (legacy) β’ supports 98 languages, background noise, diverse accents β’ translations endpoint converts to English. | |
client.audio.speech.create(model="gpt-4o-mini-tts", voice="alloy", input="text") | β’ Generates spoken audio from text β’ gpt-4o-mini-tts (new, steerable style/emotion) or tts-1/tts-1-hd (legacy) β’ 6 voices: alloy, echo, fable, onyx, nova, shimmer β’ formats: MP3, Opus, AAC, FLAC. |