Vector Databases Explained (Without the Hype)

Vector databases went from obscure to unavoidable the moment AI apps got popular. Venture money poured in, every database vendor bolted the word "vector" onto its homepage, and somewhere along the way the underlying idea — which is genuinely simple — got buried under the marketing. This post explains what a vector database actually does, how the pieces fit, and — just as important — when you don't need one.

Start with embeddings

To understand vector databases, you first need embeddings. An embedding is a list of numbers (a "vector") that represents the meaning of a piece of text, produced by an embedding model. Depending on the model, that list might be a few hundred to a few thousand numbers long. The key property: things with similar meaning get similar vectors.

"How do I get a refund?" and "I want my money back" use different words but mean nearly the same thing — so their vectors sit close together. "How's the weather?" means something else entirely, so its vector is far away. Embeddings turn meaning into geometry: closeness in this high-dimensional space corresponds to closeness in meaning.

"Close" is measured with simple math — usually cosine similarity, the angle between two vectors. You don't need to understand the geometry deeply to use it, but it helps to know that the entire edifice rests on one operation: compare two lists of numbers, get a similarity score.

It's not just text, either. There are embedding models for images, audio, and code, and multimodal models that put text and images into the same space — which is how "search my photos for beach sunsets" works without any manual tagging.
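
To make the similarity score from this section concrete, here is a minimal sketch of cosine similarity in plain Python. The three-number vectors are hand-made stand-ins, not output from any real embedding model, and the function and variable names are made up for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hand-made 3-dimensional stand-ins for real embeddings (illustrative only).
refund = [0.9, 0.1, 0.0]      # "How do I get a refund?"
money_back = [0.8, 0.2, 0.1]  # "I want my money back"
weather = [0.0, 0.1, 0.9]     # "How's the weather?"

print(cosine_similarity(refund, money_back))  # close to 1
print(cosine_similarity(refund, weather))     # close to 0
```

Real embeddings have hundreds or thousands of dimensions, but the arithmetic is the same.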

What a vector database does

Once your content is a pile of vectors, the useful question becomes: given a new vector, which stored vectors are closest to it? That's called similarity search (or nearest-neighbour search), and it's the one thing a vector database is built to do well.

You could compare a query against every stored vector one by one (a brute-force scan), and for small collections that's actually fine. But the cost grows linearly with the number of stored vectors, and once you have millions of them it's too slow for interactive use. Vector databases use specialized indexes, built on approximate nearest-neighbour algorithms like HNSW, that find the closest matches in milliseconds by accepting a small, usually irrelevant, chance of missing the exact best match. That speed at scale is the whole reason the category exists.
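
For a sense of what the brute-force baseline looks like, here is a rough sketch in plain Python. The item ids and toy vectors are invented for the example, and a real system would more likely use a numeric library, but the shape of the work is the same: score everything, keep the best k.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def brute_force_search(query, stored, k=2):
    """Compare the query with every stored vector and keep the k best.

    `stored` maps an item id to its vector. The work is one comparison per
    stored vector, so cost grows linearly with the size of the collection.
    """
    scored = ((cosine_similarity(query, vector), item_id)
              for item_id, vector in stored.items())
    return heapq.nlargest(k, scored)  # list of (score, id), best first


# Toy 3-dimensional vectors standing in for real embeddings.
stored = {
    "refund-policy": [0.9, 0.1, 0.0],
    "money-back": [0.8, 0.2, 0.1],
    "shipping": [0.3, 0.9, 0.1],
    "weather": [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]

for score, item_id in brute_force_search(query, stored):
    print(f"{item_id}: {score:.3f}")
```

An approximate index replaces the "score everything" step with a structure that only visits a small fraction of the stored vectors, which is where both the speed and the occasional miss come from.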

The word "approximate" scares people unnecessarily. In practice the recall of a well-tuned index (the fraction of the true nearest neighbours it actually returns) is very high, and in a typical application (where the top-5 results all get handed to an LLM anyway) the difference between exact and approximate search is invisible.
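
If you want to check that claim on your own data, one simple approach is to run an exact scan and the approximate index on the same queries and compare the results. A minimal sketch, with invented document ids standing in for real search results:

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the exact top-k ids that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


# Hypothetical top-5 results for one query (ids are made up).
exact_top5 = ["d1", "d2", "d3", "d4", "d5"]    # from a brute-force scan
approx_top5 = ["d1", "d2", "d3", "d5", "d9"]   # from an approximate index

print(recall_at_k(exact_top5, approx_top5))  # 0.8
```

Averaged over a sample of realistic queries, this number tells you whether the index settings are good enough for your application.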

Where you'll actually use it

Vector search underlies several now-common features:

Semantic search — finding results by meaning, not exact keywords.
Retrieval-Augmented Generation (RAG) — fetching the most relevant documents to feed an LLM so it answers grounded in your data. This is the use case driving most of the current interest.
Recommendations — "items similar to this one," computed from embedding closeness rather than hand-built rules.
Deduplication and clustering — grouping things that mean the same thing even when they're worded differently.

If your app needs to find content by meaning, vector search is probably in your future. Whether a vector database is — that's the next section.

Do you actually need a dedicated one?

Here's the part the marketing skips: you often don't need a standalone vector database.

Small dataset (up to a few thousand items)? Keep the vectors in memory and compare them directly — a brute-force loop over a few thousand vectors takes single-digit milliseconds. No database required, nothing new to operate, nothing new to pay for. I've shipped exactly this: the grounded-answers demo on this site embeds and searches entirely in the browser, no server at all.
Already running Postgres? The pgvector extension adds vector columns, similarity operators, and HNSW indexing to the database you already have. Your vectors live next to the rows they describe, they're covered by the same backups and transactions, and there's no second system to keep in sync. For most products this is the sweet spot, and equivalents exist across the ecosystem — SQLite has extensions, and the major search engines (Elasticsearch, OpenSearch) do vector search natively now.
Millions of vectors, high query volume, heavy filtering? Now a dedicated vector database (Qdrant, Milvus, Weaviate, Chroma, and friends) earns its keep, with index tuning, horizontal scaling, and operational features built specifically for the job.

Reaching for a specialized database on day one, for a few hundred documents, is a classic case of over-engineering — you take on an extra system to operate, a sync problem between it and your source of truth, and a bill, in exchange for solving a scale problem you don't have. Start simple; the migration path upward is well-trodden if you ever need it.

Things that matter in practice

Use the same embedding model everywhere. Vectors from different models aren't comparable — not even from different versions of the same model. If you change models, you must re-embed the entire corpus, so record which model produced each vector.
Store metadata alongside vectors. You'll want to filter results ("only this user's documents," "only published articles") and to show where a result came from. Filtering by metadata during vector search is an area where both dedicated databases and pgvector have matured a lot, but it only works if the metadata is stored with the vectors in the first place.
Mind the cost of embedding. Generating embeddings for a large corpus costs real time and money, and it's easy to accidentally do twice. Store a content hash per item and skip re-embedding anything unchanged.
Combine with keyword search when it helps. Pure semantic search misses exact terms — product codes, error strings, names. A hybrid of keyword + vector search, with merged results, is often the most robust setup for real content.
Don't skip access control. A vector index happily returns the nearest neighbours regardless of who's asking. If documents belong to different users or tenants, enforce that in the query filter, every time — it's the kind of leak that's silent until it's a headline.
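
Three of the points above are easy to sketch in a few lines of plain Python: skipping unchanged content via a hash, filtering by tenant before ranking, and merging keyword and vector results. All names and data here are hypothetical. The merge uses reciprocal rank fusion, which is one common way to combine ranked lists and is my choice for the example, not the only option. In a real system the tenant filter would live inside the database query rather than in application code.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of an item's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def items_to_embed(documents: dict[str, str], stored_hashes: dict[str, str]) -> list[str]:
    """Return ids whose text is new or has changed since the last embedding run."""
    return [doc_id for doc_id, text in documents.items()
            if stored_hashes.get(doc_id) != content_hash(text)]


def visible_to(records: list[dict], tenant_id: str) -> list[dict]:
    """Keep only the caller's records, before any similarity ranking happens."""
    return [r for r in records if r["tenant_id"] == tenant_id]


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked id lists; each list adds 1 / (k + rank) to an id's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical data: document "b" was edited since it was last embedded.
documents = {"a": "Refunds take 5 days.", "b": "Shipping is free."}
stored_hashes = {"a": content_hash("Refunds take 5 days."),
                 "b": content_hash("Shipping was free.")}
print(items_to_embed(documents, stored_hashes))

records = [{"id": "a", "tenant_id": "t1"}, {"id": "b", "tenant_id": "t2"}]
print(visible_to(records, "t1"))

keyword_hits = ["sku-42", "a", "b"]   # exact-term matches, best first
vector_hits = ["a", "c", "sku-42"]    # semantic matches, best first
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Items that rank well in both lists rise to the top of the merged result, while an exact match like a product code still makes it in even when the semantic search ranks it low.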

Summary

A vector database is a tool for one specific job: finding the items whose meaning is closest to a query, fast, at scale. It's powered by embeddings, which turn meaning into geometry, and accelerated by approximate nearest-neighbour indexes, which make the search fast enough to sit in a request path. It's essential infrastructure for semantic search and RAG at real scale — but for small datasets, in-memory search or a vector extension on the database you already run is usually the smarter, simpler choice. Match the tool to your scale, not to the hype.