pgvector is an open-source PostgreSQL extension that turns PostgreSQL into a vector database by adding native support for storing, indexing, and querying high-dimensional vector embeddings. Unlike standalone vector databases, pgvector keeps vectors alongside your existing relational data, so you can build semantic search, recommendations, and RAG (Retrieval-Augmented Generation) systems without introducing a separate database layer. The extension supports approximate nearest neighbor (ANN) search through two index types (IVFFlat and HNSW), offers distance metrics including L2, cosine, and inner product, and scales from thousands to tens of millions of vectors depending on hardware and tuning. The key mental model: pgvector makes vector similarity search feel like a native PostgreSQL feature. Vectors are just another column type, and similarity queries use familiar SQL with specialized operators such as `<->` for L2 distance.
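To make the three metrics concrete, here is a minimal pure-Python sketch of what each one computes, plus a brute-force stand-in for an `ORDER BY embedding <-> query LIMIT k` query. The function names and the `nearest` helper are illustrative only and are not part of pgvector. The operator mapping in the comments (`<->` for L2, `<=>` for cosine distance, `<#>` for negative inner product) follows pgvector's documented operators. Zero-length vectors are not handled here.

```python
import math


def l2_distance(a, b):
    """Euclidean distance; what pgvector's <-> operator returns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def negative_inner_product(a, b):
    """pgvector's <#> returns the NEGATIVE inner product, so that
    ascending ORDER BY puts the most similar rows first."""
    return -inner_product(a, b)


def cosine_distance(a, b):
    """1 - cosine similarity; what pgvector's <=> operator returns.
    Assumes neither vector is all zeros (no division-by-zero guard)."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


def nearest(query, rows, distance=l2_distance, k=2):
    """Exact (brute-force) k-nearest-neighbor scan over (id, vector) pairs.
    Hypothetical helper: it mimics sorting rows by distance and keeping k."""
    ranked = sorted(rows, key=lambda row: distance(row[1], query))
    return [row_id for row_id, _ in ranked[:k]]


if __name__ == "__main__":
    rows = [("a", [0.0, 0.0]), ("b", [1.0, 1.0]), ("c", [5.0, 5.0])]
    print(nearest([0.9, 0.9], rows))
```

An ANN index such as IVFFlat or HNSW trades some of this exactness for speed: it avoids scanning every row, so its results can differ slightly from the brute-force ranking above.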
What This Cheat Sheet Covers
This cheat sheet spans 18 focused tables and 134 indexed concepts. Below is a complete table-by-table outline, from foundational concepts through advanced details.
Table 1: Installation and Setup
| Method | Example | Description |
|---|---|---|
| Enable extension | `CREATE EXTENSION vector;` | Enables pgvector in the current database after installation; must be run once per database and requires superuser or database owner privileges. |
| Docker | `docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=postgres pgvector/pgvector:pg17` | Official Docker images with pgvector pre-installed; available for PostgreSQL 12-18 and the simplest way to start locally. The underlying postgres image will not start without `POSTGRES_PASSWORD`; the value shown is a placeholder. |
| Homebrew | `brew install pgvector` | macOS installation via Homebrew; installs pgvector for a Homebrew-managed PostgreSQL. |
| APT | `sudo apt install postgresql-17-pgvector` | Debian/Ubuntu package installation; the version number in the package name must match your PostgreSQL major version. |
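Because both the APT package and the Docker tag embed the PostgreSQL major version, a small helper can derive them from the string that `SHOW server_version;` returns. This is a sketch under stated assumptions: the function names are hypothetical, the naming patterns are copied from the rows above, and the version parsing assumes PostgreSQL 10 or later, where the major version is the first number.

```python
import re


def pg_major_version(server_version: str) -> int:
    """Extract the major version from a string such as
    '17.2 (Debian 17.2-1.pgdg120+1)'. Assumes PostgreSQL 10+."""
    match = re.match(r"\s*(\d+)", server_version)
    if match is None:
        raise ValueError(f"unrecognized server_version: {server_version!r}")
    return int(match.group(1))


def apt_package_name(server_version: str) -> str:
    """Debian/Ubuntu package name, following the pattern in the table."""
    return f"postgresql-{pg_major_version(server_version)}-pgvector"


def docker_image_tag(server_version: str) -> str:
    """Docker image reference, following the pattern in the table."""
    return f"pgvector/pgvector:pg{pg_major_version(server_version)}"


if __name__ == "__main__":
    version = "17.2 (Debian 17.2-1.pgdg120+1)"
    print(apt_package_name(version))
    print(docker_image_tag(version))
```

Check that the resulting package or tag actually exists for your PostgreSQL version before relying on it, since the supported range changes between pgvector releases.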