The OpenAI API provides programmatic access to large language models (LLMs) including GPT-4 and GPT-5, along with specialized models for text generation, embeddings, image creation, speech processing, and moderation. It is a RESTful API with official SDKs for Python, Node.js, and other languages, so developers can integrate AI capabilities into applications via plain HTTP requests. Unlike the ChatGPT web interface (a consumer product), the API offers token-based pricing, fine-grained control over model parameters, and rate limits that rise with paid usage tiers, making it the foundation for production AI applications ranging from chatbots to code generation tools. In March 2025, OpenAI introduced the Responses API as the successor to Chat Completions, offering better performance with reasoning models and streamlined tool usage, and it is deprecating the older Assistants API (shutdown scheduled for August 26, 2026). Effective API usage depends on understanding three features: prompt caching (which can reduce input costs by up to 90%), structured outputs for guaranteed JSON schema adherence, and function calling for connecting LLMs to external tools and databases.
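
To make the "simple HTTP requests" point concrete, here is a minimal sketch of how a Responses API call could be assembled with only the Python standard library. The helper name `build_responses_request`, the placeholder key, and the default model string `gpt-5` are assumptions for illustration; the sketch builds the request without sending it, and the official SDK is likely the more convenient choice in practice.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/responses"


def build_responses_request(prompt: str, model: str = "gpt-5") -> urllib.request.Request:
    """Build (but do not send) a POST request for the Responses API.

    The model name is a placeholder; substitute any model your account can use.
    """
    # Read the key from the environment; the fallback is a non-working placeholder.
    api_key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs a valid key and network)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


req = build_responses_request("Say hello in one short sentence.")
print(req.get_method(), req.full_url)
```

Calling `send(req)` would perform the actual request; it is left uncalled here so the sketch runs without credentials.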
What This Cheat Sheet Covers
This cheat sheet spans 20 focused tables and 150 indexed concepts, from foundational concepts through advanced details. A complete table-by-table outline follows.
Table 1: API Endpoints
| Endpoint | Example | Description |
|---|---|---|
| Responses API | POST https://api.openai.com/v1/responses | • Successor to Chat Completions; OpenAI cites about a 3% gain with reasoning models, plus native tool chaining, session management, and built-in state tracking • recommended for all new applications as of March 2025. |
| Chat Completions API | POST https://api.openai.com/v1/chat/completions | • Industry-standard endpoint for generating conversational responses • supports streaming, function calling, and vision (multimodal inputs) • will remain supported indefinitely, but the Responses API offers better performance. |
| Completions API (legacy) | POST https://api.openai.com/v1/completions | • Older endpoint for text completion without the chat format • received its final update in July 2023 • use Chat Completions or Responses instead for new projects. |
| Embeddings API | POST https://api.openai.com/v1/embeddings | • Converts text into numerical vector representations (by default 1536 dimensions for text-embedding-3-small, 3072 for text-embedding-3-large) • used for semantic search, clustering, recommendations, and RAG applications. |
| Image Generation API | POST https://api.openai.com/v1/images/generations | • Creates images from text prompts using GPT Image 1.5 (successor to DALL-E 3, which is scheduled for deprecation on May 12, 2026) • the 1024×1024, 1792×1024, and 1024×1792 sizes and the standard/HD quality settings are DALL-E 3 options; GPT Image models use their own size and quality values. |
| Whisper Speech-to-Text API | POST https://api.openai.com/v1/audio/transcriptions | • Transcribes audio files into text in the source language • supports 25+ languages, automatic language detection, and timestamps • pricing $0.006/minute • max file size 25 MB. |
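
The Embeddings row mentions semantic search; one common way to use the returned vectors is to rank documents by cosine similarity to a query vector. The sketch below assumes the embeddings have already been fetched and uses toy 3-dimensional vectors in place of real 1536-dimensional ones. The function names and sample data are hypothetical, not part of the API.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], docs: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Score every document against the query, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in docs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy stand-ins for vectors returned by the embeddings endpoint.
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "taxes": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, docs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A linear scan like this is fine for small collections; larger ones typically move to a vector index, which is outside the scope of this sketch.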