Semantic search is a data retrieval technique that focuses on the contextual meaning and intent behind a user's query rather than relying solely on exact keyword matching. It sits within the broader fields of natural language processing (NLP), information retrieval, and AI-powered search, and it has become foundational to modern applications such as Retrieval-Augmented Generation (RAG), recommendation engines, and enterprise knowledge bases. Unlike traditional lexical search (BM25, TF-IDF), which matches literal terms, semantic search maps queries and documents into high-dimensional vector embeddings that capture semantic relationships, so a search for "computer for coding" can surface a document about a "laptop for programming." A critical insight: hybrid approaches that combine semantic and lexical signals nearly always outperform either method alone in production systems, because they balance semantic understanding with exact term matching.
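
The "computer for coding" example can be sketched with toy vectors. This is a minimal illustration, not a real embedding model: the `WORD_VECS` table, its 3-dimensional values, and the helper names `embed`, `cosine`, and `lexical_overlap` are all assumptions made up for this sketch. A production system would obtain its vectors from a trained encoder.

```python
import math

# Hypothetical hand-written word vectors (illustration only).
# Dimensions loosely mean: [hardware, software-writing, cooking].
WORD_VECS = {
    "computer":    [1.0, 0.0, 0.0],
    "laptop":      [0.9, 0.1, 0.0],
    "coding":      [0.0, 1.0, 0.0],
    "programming": [0.1, 0.9, 0.0],
    "recipe":      [0.0, 0.0, 1.0],
    "pasta":       [0.0, 0.1, 0.9],
}

def embed(text):
    """Average the vectors of known words; unknown words (e.g. 'for') are skipped."""
    vecs = [WORD_VECS[w] for w in text.lower().split() if w in WORD_VECS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def lexical_overlap(query, doc):
    """Count the literal tokens shared by query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

query = "computer for coding"
docs = ["laptop for programming", "recipe for pasta"]

semantic_scores = {d: cosine(embed(query), embed(d)) for d in docs}
lexical_scores = {d: lexical_overlap(query, d) for d in docs}
best_semantic = max(docs, key=semantic_scores.get)
```

Both documents share only the token "for" with the query, so literal overlap cannot separate them, while the toy embeddings place "laptop for programming" far closer to the query than "recipe for pasta".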
What This Cheat Sheet Covers
This topic spans 15 focused tables and 102 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Core Retrieval Approaches
| Method | Example | Description |
|---|---|---|
| Dense retrieval | `query_emb = model.encode("laptop"); results = search(query_emb, index)` | Uses neural embeddings to represent text as continuous vectors; captures semantic meaning rather than surface-level patterns. |
| Sparse retrieval (BM25) | `scores = bm25.get_scores(query_tokens)` | Uses exact term matching with statistical weighting; produces high-dimensional sparse vectors in which most values are zero. |
| Hybrid search | `results = alpha*dense + (1-alpha)*sparse` | Combines dense and sparse retrieval scores into a single ranked list; balances semantic understanding with keyword precision. |
| Bi-encoder | `q_emb = encoder(query); d_emb = encoder(doc); sim = cosine(q_emb, d_emb)` | Encodes query and document independently, then computes their similarity; fast at scale but less accurate than cross-encoders. |
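
The weighted combination `alpha*dense + (1-alpha)*sparse` from the table can be sketched end to end in plain Python. This is one possible reading under stated assumptions, not a reference implementation: `bm25_scores` is a compact Okapi BM25 with the commonly used defaults `k1=1.5` and `b=0.75` and a non-negative IDF variant, the `dense` list holds placeholder similarities standing in for real encoder output, and min-max normalization before fusion is a choice made here because raw BM25 and cosine scores live on different scales.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenized document against the query with Okapi BM25."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if tf[term] == 0:
                continue
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores

def min_max(scores):
    """Rescale scores to [0, 1]; all zeros if every score is identical."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(dense, sparse, alpha=0.5):
    """Weighted sum of normalized dense and sparse scores."""
    return [alpha * d + (1 - alpha) * s
            for d, s in zip(min_max(dense), min_max(sparse))]

docs = ["laptop for programming", "gaming laptop deals", "recipe for pasta"]
corpus_tokens = [d.split() for d in docs]
query_tokens = "laptop for coding".split()

sparse = bm25_scores(query_tokens, corpus_tokens)
dense = [0.92, 0.40, 0.05]  # assumed placeholder similarities, not model output
fused = hybrid_scores(dense, sparse, alpha=0.6)
ranking = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
```

Here `alpha` controls how much weight the dense signal receives. In practice it would be tuned on held-out queries, and rank-based schemes such as reciprocal rank fusion are a common alternative to score normalization.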