Vector Databases Explained (Without the Hype)
Vector databases went from obscure to unavoidable the moment AI apps got popular. The marketing around them is thick, but the underlying idea is genuinely simple. This post explains what a vector database actually does, and — just as important — when you don't need one.
Start with embeddings
To understand vector databases, you first need embeddings. An embedding is a list of numbers (a "vector") that represents the meaning of a piece of text, produced by an embedding model. The key property: things with similar meaning get similar vectors.
"How do I get a refund?" and "I want my money back" use different words but mean nearly the same thing — so their vectors sit close together. "How's the weather?" means something else entirely, so its vector is far away. Embeddings turn meaning into geometry.
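To make "close together" concrete, here is a minimal sketch of cosine similarity, one common way to measure how close two vectors are. The three-dimensional vectors are invented for illustration and do not come from any real embedding model; the variable names are likewise hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors, for illustration only. Real embedding models
# produce vectors with hundreds or thousands of dimensions.
refund = [0.9, 0.1, 0.0]       # "How do I get a refund?"
money_back = [0.8, 0.2, 0.1]   # "I want my money back"
weather = [0.0, 0.2, 0.9]      # "How's the weather?"

sim_close = cosine_similarity(refund, money_back)
sim_far = cosine_similarity(refund, weather)

print(f"refund vs money back: {sim_close:.2f}")
print(f"refund vs weather:    {sim_far:.2f}")
```

With these toy numbers, the two refund sentences score close to 1 and the weather sentence scores close to 0, which is the geometric version of "similar meaning" and "different meaning".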
What a vector database does
Once your text is a bunch of vectors, the useful question becomes: given a new vector, which stored vectors are closest to it? That's called similarity search, and it's the one thing a vector database is built to do well.
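You could compare a query against every stored vector one by one, but the cost of that scan grows linearly with the size of the collection, and it becomes too slow once you have millions of vectors. Vector databases use specialized indexes (approximate nearest-neighbour algorithms) that examine only a small fraction of the stored vectors. They trade a little accuracy for a lot of speed: a true nearest neighbour can occasionally be missed, but the closest matches come back in milliseconds instead of seconds. That speed at scale is their whole reason to exist.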
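The one-by-one comparison can be sketched in a few lines. This is the exact, brute-force baseline that an approximate index tries to beat, not how any particular vector database works internally. The function name `brute_force_search`, the document ids, and the toy vectors are all assumptions made for the example.

```python
import heapq
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query, stored, k=2):
    """Score every stored vector against the query and return the k best (id, score) pairs.

    Work grows linearly with len(stored), which is why this stops
    being practical for very large collections.
    """
    scored = ((doc_id, cosine_similarity(query, vec)) for doc_id, vec in stored.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Hypothetical ids and made-up vectors, for illustration only.
stored = {
    "refund-policy": [0.9, 0.1, 0.0],
    "money-back": [0.8, 0.2, 0.1],
    "weather-report": [0.0, 0.2, 0.9],
}
query = [0.85, 0.15, 0.05]

results = brute_force_search(query, stored, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

For a small collection, a scan like this is all the "vector search" you need; an approximate index only starts to pay off when the scan itself becomes the bottleneck.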
Where you'll actually use it
Vector search underlies several now-common features:
- Semantic search: finding results by meaning, not exact keywords.
- Retrieval-Augmented Generation (RAG): fetching the most relevant documents to feed an LLM so that its answers are grounded in your data.
- Recommendations: "items similar to this one."
- Deduplication and clustering: grouping things that mean the same thing.
If your app needs to find content by meaning, vector search is probably in your future.
Do you actually need a dedicated one?
Here's the part the marketing skips: you often don't need a standalone vector database.
- Small dataset (up to a few thousand items)? You can keep the vectors in memory and compare them directly. No database required. This is faster to build and perfectly fast to run at that scale.
- Already using a general database? Many mainstream databases now offer vector-search extensions. If you have one, adding vector search to it is often simpler than running a whole new system.
- Large scale, high traffic, millions of vectors? Now a dedicated vector database earns its keep with performance, filtering, and operational features built for the job.
Reaching for a specialized database on day one, for a few hundred documents, is a classic case of over-engineering.
Things that matter in practice
- Use the same embedding model everywhere. Each model defines its own vector space, often with a different number of dimensions, so vectors from different models aren't comparable. If you change models, you must re-embed everything.
- Store metadata alongside vectors. You'll want to filter ("only this user's documents") and to show where a result came from.
- Mind the cost of embedding. Generating embeddings for a large corpus costs time and money — budget for it, and avoid re-embedding unchanged content.
- Combine with keyword search when it helps. Pure semantic search can miss exact terms like product codes; a hybrid of keyword + vector search is often the most robust.
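Two of the points above, metadata filtering and hybrid search, can be sketched together. This is a simplified illustration under stated assumptions: the record fields (`id`, `owner`, `text`, `vector`), the `alpha` weight, the product code, and the crude `keyword_score` function are all hypothetical, and a weighted sum is only one of several ways to combine keyword and vector scores.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def keyword_score(query, text):
    """Fraction of query terms found verbatim in the text (a crude stand-in for a real keyword ranker)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(term in words for term in terms) / len(terms) if terms else 0.0

def hybrid_search(query_text, query_vec, records, owner, alpha=0.5, k=3):
    """Filter by metadata first, then rank by a weighted mix of vector and keyword scores."""
    candidates = [r for r in records if r["owner"] == owner]
    scored = [
        (
            r["id"],
            alpha * cosine_similarity(query_vec, r["vector"])
            + (1 - alpha) * keyword_score(query_text, r["text"]),
        )
        for r in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical records with made-up vectors, for illustration only.
records = [
    {"id": "a", "owner": "alice", "text": "Return policy for item SKU-4417", "vector": [0.7, 0.3, 0.1]},
    {"id": "b", "owner": "alice", "text": "How to get your money back", "vector": [0.9, 0.1, 0.0]},
    {"id": "c", "owner": "bob", "text": "Refund for SKU-4417", "vector": [0.9, 0.1, 0.0]},
]

results = hybrid_search("refund SKU-4417", [0.9, 0.1, 0.0], records, owner="alice")
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

In this toy data, record "b" is the closest vector match, but record "a" contains the exact product code, so the keyword half of the score lifts it to the top. Record "c" never appears because the metadata filter restricts results to one owner's documents.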
Summary
A vector database is a tool for one specific job: finding the items whose meaning is closest to a query, fast, at scale. It's powered by embeddings, which turn meaning into geometry. It's essential for semantic search and RAG at real scale — but for small datasets, in-memory search or a vector extension on your existing database is often the smarter, simpler choice. Match the tool to your scale, not to the hype.