Google Gemini is a family of multimodal large language models by Google DeepMind, capable of understanding and generating text, images, audio, video, and code. As of April 2026, the production-stable models are Gemini 2.5 Pro and Gemini 2.5 Flash (1M-token context windows), while Gemini 3.1 Pro Preview (2M-token context) leads the preview generation with state-of-the-art reasoning. The API is accessible via two paths: the Gemini Developer API (API keys, Google AI Studio) and Vertex AI (enterprise GCP deployment with IAM). The official Python SDK is `google-genai` (`pip install google-genai`), which uses a unified `Client`-based API for both deployment paths.
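As a minimal sketch of the two access paths described above, the snippet below builds keyword arguments for `genai.Client` and sends one prompt. The helper names `client_kwargs` and `ask`, the environment variable names, and the placeholder project and location values are assumptions made for illustration; only `genai.Client` and `client.models.generate_content` come from the SDK.

```python
import os
from typing import Optional


def client_kwargs(use_vertex: bool = False) -> dict:
    """Build keyword arguments for genai.Client for one of the two access paths."""
    if use_vertex:
        # Vertex AI path: authentication comes from GCP credentials (IAM),
        # not from an API key. Project and location here are placeholders.
        return {
            "vertexai": True,
            "project": os.environ.get("GOOGLE_CLOUD_PROJECT", "my-project"),
            "location": os.environ.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        }
    # Gemini Developer API path: an API key created in Google AI Studio.
    return {"api_key": os.environ.get("GEMINI_API_KEY", "")}


def ask(prompt: str, model: str = "gemini-2.5-flash",
        use_vertex: bool = False) -> Optional[str]:
    """Send one text prompt and return the model's text reply."""
    # Imported lazily so client_kwargs() can be used without the SDK installed.
    from google import genai

    client = genai.Client(**client_kwargs(use_vertex))
    response = client.models.generate_content(model=model, contents=prompt)
    return response.text
```

With a valid key in `GEMINI_API_KEY`, calling `ask("Explain context windows in one sentence.")` would use the Developer API; passing `use_vertex=True` switches the same call to Vertex AI without changing the request code.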
What This Cheat Sheet Covers
This cheat sheet spans 16 focused tables and 111 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.
Table 1: Model Variants
| Model | Example | Description |
|---|---|---|
| Gemini 2.5 Flash | `model='gemini-2.5-flash'` | • Most widely used stable model: best balance of speed, capability, and cost • 1M token context, 65K max output tokens; supports thinking mode via `thinking_budget` • Free tier available; shuts down June 17, 2026. |
| Gemini 2.5 Pro | `model='gemini-2.5-pro'` | • Stable flagship for complex reasoning, coding, and multimodal tasks • 1M token context, 65K max output tokens; adaptive thinking, implicit context caching • Shuts down June 17, 2026. |
| Gemini 3.1 Pro Preview | `model='gemini-3.1-pro-preview'` | • Latest-generation flagship preview (released Feb 19, 2026) • 2M token context window, double that of previous models • `thinking_level` enum: LOW, MEDIUM, HIGH • No free tier; production-ready preview. |
| Gemini 3 Flash Preview | `model='gemini-3-flash-preview'` | • Speed + intelligence preview (released Dec 17, 2025) • Agentic Vision: zooms, inspects, and manipulates images via code execution • `thinking_level` for latency control; free tier available. |
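
The table distinguishes `thinking_budget` (2.5 models) from `thinking_level` (3.x previews). The sketch below shows one way to choose between them from the model ID. The helper names, the effort labels, and the token budgets (512, 4096, 16384) are illustrative assumptions, not documented defaults. Passing the config as a plain dict and the uppercase level strings are also assumptions to check against the SDK version in use.

```python
from typing import Optional

# Hypothetical mapping from a coarse effort label to a 2.5-series token budget.
_BUDGETS = {"low": 512, "medium": 4096, "high": 16384}


def thinking_config_for(model: str, effort: str = "low") -> dict:
    """Return a thinking_config dict suited to the model generation."""
    if effort not in _BUDGETS:
        raise ValueError(f"unknown effort: {effort!r}")
    if model.startswith("gemini-2.5"):
        # 2.5 models: thinking is controlled by a token budget.
        return {"thinking_budget": _BUDGETS[effort]}
    if model.startswith("gemini-3"):
        # 3.x previews: thinking is controlled by a level (LOW, MEDIUM, HIGH).
        return {"thinking_level": effort.upper()}
    raise ValueError(f"no thinking settings known for model: {model!r}")


def generate(prompt: str, model: str, effort: str = "low",
             api_key: str = "") -> Optional[str]:
    """Send one prompt with a thinking configuration chosen for the model."""
    # Imported lazily so thinking_config_for() works without the SDK installed.
    from google import genai

    client = genai.Client(api_key=api_key)
    response = client.models.generate_content(
        model=model,
        contents=prompt,
        config={"thinking_config": thinking_config_for(model, effort)},
    )
    return response.text
```

Keeping the mapping in one function means a later move from a 2.5 model to a 3.x preview only changes the model string at the call site.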