A vector database is a specialized database designed to store, index, and search high-dimensional vectors (embeddings) efficiently. While traditional databases search by exact matches or ranges (find all customers in Texas), vector databases search by similarity (find the 10 documents most similar to this query). They're a critical piece of modern AI infrastructure.
Why vector databases exist: When you embed millions of documents as vectors with 1,536 dimensions each, you need a way to quickly find the most similar vectors to a query. Naive comparison (checking every vector against the query) is too slow at scale. Vector databases use specialized indexing algorithms to make similarity search fast — typically returning results in milliseconds even across millions of vectors.
How they work:
- Generate embeddings for your data (text, images, etc.) using an embedding model
- Store these vectors in the database along with metadata (original text, source, date, etc.)
- At query time, embed the user's query using the same model
- Search the index with approximate nearest neighbor (ANN) algorithms to find the closest stored vectors
- Return the most similar items with their metadata
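The steps above can be sketched with a brute-force (exact) search in pure Python. A real vector database replaces the linear scan with an ANN index (HNSW, IVF, etc.), but the interface — vectors in, nearest neighbors out — is the same. The tiny 3-dimensional "embeddings" here stand in for real ones with hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    # similarity = dot(a, b) / (|a| * |b|); 1.0 means identical direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(index, query_vector, k=2):
    # index: list of (vector, metadata) pairs.
    # Brute-force scan is O(n * d) — this is the loop an ANN index avoids.
    scored = [(cosine_similarity(vec, query_vector), meta) for vec, meta in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

index = [
    ([0.9, 0.1, 0.0], {"text": "refund policy"}),
    ([0.0, 1.0, 0.1], {"text": "shipping times"}),
    ([0.8, 0.2, 0.1], {"text": "returns and refunds"}),
]
results = search(index, [1.0, 0.0, 0.0], k=2)
print([meta["text"] for _, meta in results])
# -> ['refund policy', 'returns and refunds']
```

Swapping the scan for an ANN index trades a small amount of accuracy (see recall, below) for orders-of-magnitude faster queries.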
Popular vector databases:
- Pinecone: Fully managed. Easy to start, scales automatically. Offers serverless and pod-based pricing (pods at $0.096/hour).
- Weaviate: Open source with managed cloud option. Supports hybrid search (vector + keyword).
- Chroma: Lightweight, great for prototyping. Runs in-memory or with persistence.
- Milvus: Open source, enterprise-grade. Handles billions of vectors.
- pgvector: PostgreSQL extension. Use your existing Postgres setup for vector search. Good for smaller datasets (under 10 million vectors).
- Qdrant: Open source, Rust-based, very fast. Growing in popularity.
The primary use case — RAG: Retrieval-Augmented Generation is the killer app for vector databases. The pattern: embed your company's documents in a vector database, embed user questions at query time, retrieve the most relevant documents, and feed them to a language model as context. This is how enterprise AI assistants answer questions about internal knowledge without fine-tuning.
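The final step of the RAG pattern — feeding retrieved documents to the model as context — can be sketched as a prompt-assembly helper. `build_rag_prompt` is a hypothetical function for illustration; the embedding, retrieval, and LLM calls are elided:

```python
def build_rag_prompt(question, retrieved_docs):
    # Number each retrieved passage and pack them into the prompt's context.
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved_docs))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# In a real pipeline, these documents come from the vector database's
# top-k similarity search over your embedded corpus.
docs = ["Refunds are issued within 14 days.", "Shipping takes 3-5 business days."]
prompt = build_rag_prompt("How long do refunds take?", docs)
print(prompt)
```

The resulting string is what gets sent to the language model, which answers from the supplied context rather than from fine-tuned weights.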
Other use cases: Recommendation engines, image similarity search, duplicate detection, anomaly detection, and semantic caching (storing and reusing similar AI responses).
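Semantic caching, the last use case above, is simple enough to sketch: store the embedding of each past query alongside its response, and reuse a response when a new query is similar enough. This sketch assumes embeddings are L2-normalized, so a dot product equals cosine similarity; the threshold value is an illustrative choice:

```python
def semantic_cache_lookup(cache, query_vec, threshold=0.95):
    # cache: list of (embedding, response) pairs from earlier queries.
    # Assumes L2-normalized embeddings, so dot product == cosine similarity.
    best_sim, best_hit = 0.0, None
    for vec, response in cache:
        sim = sum(x * y for x, y in zip(vec, query_vec))
        if sim > best_sim:
            best_sim, best_hit = sim, response
    # Only reuse the cached response on a sufficiently close match.
    return best_hit if best_sim >= threshold else None

cache = [([1.0, 0.0], "Refunds take 14 days.")]
print(semantic_cache_lookup(cache, [0.99, 0.141]))  # near-duplicate query: hit
print(semantic_cache_lookup(cache, [0.0, 1.0]))     # unrelated query: None
```

In production the cache itself lives in the vector database, so the lookup is just another similarity search.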
Choosing a vector database:
- Under 100K vectors: pgvector or Chroma. Keep it simple.
- 100K-10M vectors: Pinecone, Weaviate, or Qdrant. Balance of features and performance.
- Over 10M vectors: Milvus or Pinecone Enterprise. Purpose-built for scale.
Key metrics: Query latency (how fast results return), recall (the fraction of the true nearest neighbors the approximate index actually returns), index build time, and cost per million vectors stored.
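Recall is typically measured by comparing the ANN index's results against ground truth from an exhaustive exact search. A minimal sketch of recall@k:

```python
def recall_at_k(retrieved_ids, true_ids, k):
    # Fraction of the true nearest neighbors that appear in the top-k
    # results returned by the approximate index.
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(true_ids)) / len(true_ids)

approx = [1, 2, 7, 4]   # ANN index returned doc 7 instead of true neighbor 3
exact = [1, 2, 3, 4]    # ground truth from a brute-force search
print(recall_at_k(approx, exact, k=4))  # 3 of 4 true neighbors found -> 0.75
```

Most ANN indexes expose parameters (e.g. HNSW's search depth) that trade recall against latency, so these two metrics are usually tuned together.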
Vector databases are not replacing traditional databases — they complement them. Most production systems use both: a vector database for semantic search and a traditional database for structured data and transactions.