Embeddings Cheat Sheet

Updated 2026-04-27

Next Topic: Few-Shot and Zero-Shot Learning Cheat Sheet

Embeddings are dense vector representations that map discrete data (text, images, code, audio, graphs) into continuous high-dimensional spaces where semantic similarity corresponds to geometric proximity. They power modern AI applications including search, retrieval-augmented generation (RAG), recommendation systems, and classification. Unlike sparse representations that encode presence/absence, embeddings capture nuanced meaning and relationships through learned patterns, enabling machines to compare, cluster, and reason about complex data using distance metrics. The field has rapidly shifted from static word embeddings toward instruction-aware decoder-only LLMs achieving SOTA results on the MTEB leaderboard.

What This Cheat Sheet Covers

This topic spans 24 focused tables and 129 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Core Embedding ConceptsTable 2: Word Embedding TechniquesTable 3: Transformer-Based Sentence EmbeddingsTable 4: Cloud Embedding APIsTable 5: Document and Paragraph EmbeddingsTable 6: Image EmbeddingsTable 7: Multimodal EmbeddingsTable 8: Audio EmbeddingsTable 9: Code EmbeddingsTable 10: Graph and Knowledge EmbeddingsTable 11: Distance Metrics and SimilarityTable 12: Dimensionality Reduction for VisualizationTable 13: Embedding Normalization and PreprocessingTable 14: Advanced Embedding ArchitecturesTable 15: Positional EmbeddingsTable 16: Vector DatabasesTable 17: Embedding Quantization and CompressionTable 18: Sparse and Hybrid EmbeddingsTable 19: Fine-Tuning and AdaptationTable 20: Embedding EvaluationTable 21: Training ObjectivesTable 22: Embedding Drift and MonitoringTable 23: Production Best PracticesTable 24: ANN Index Types for Vector Search

Table 1: Core Embedding Concepts

Start here for the vocabulary everything else builds on — what a dense vector actually is, why nearby points mean related meanings, and how dimension count trades nuance against cost. The crucial split to internalize is static versus contextual: whether a word like "bank" always gets the same vector or one shaped by the sentence around it.

Concept	Example	Description
Dense vector representation	`[0.12, -0.45, 0.89, ..., 0.33]` (768-D)	• Maps discrete tokens/objects into continuous numerical vectors where each dimension encodes learned semantic features • typical sizes range from 128 to 4096 dimensions.
Semantic similarity	`similarity("dog", "puppy") > similarity("dog", "car")`	Embeddings encode meaning such that semantically related concepts cluster together in vector space, enabling similarity search via distance calculation.
Embedding dimension	`text-embedding-3-small: 1536-D` `text-embedding-3-large: 3072-D`	• Number of values in each vector • higher dimensions capture more nuanced distinctions but increase memory and compute cost.
Embedding space	Learned 768-D manifold preserving semantic structure	• High-dimensional geometric space where embeddings live • distance and direction encode relationships between data points.

Table 1: Core Embedding Concepts

Concept	Example	Description
Dense vector representation	`[0.12, -0.45, 0.89, ..., 0.33]` (768-D)	• Maps discrete tokens/objects into continuous numerical vectors where each dimension encodes learned semantic features • typical sizes range from 128 to 4096 dimensions.
Semantic similarity	`similarity("dog", "puppy") > similarity("dog", "car")`	Embeddings encode meaning such that semantically related concepts cluster together in vector space, enabling similarity search via distance calculation.
Embedding dimension	`text-embedding-3-small: 1536-D` `text-embedding-3-large: 3072-D`	• Number of values in each vector • higher dimensions capture more nuanced distinctions but increase memory and compute cost.
Embedding space	Learned 768-D manifold preserving semantic structure	• High-dimensional geometric space where embeddings live • distance and direction encode relationships between data points.