Skip to content
← All writing
June 27, 2026·4 min read

Fine-Tuning vs RAG vs Prompting: How to Add Knowledge to an LLM

When a language model doesn't behave the way your app needs, there are three levers you can pull: prompting, retrieval (RAG), and fine-tuning. They're often discussed as competitors, but they solve different problems, and the right answer is frequently a combination. This post lays out what each one actually does so you can choose deliberately.

First, diagnose the problem

Before picking a technique, figure out what kind of gap you're facing:

  • Knowledge gap — the model doesn't know something (your docs, your data, recent events).
  • Behaviour gap — the model knows enough but doesn't act the way you want (wrong tone, format, or style).
  • Skill gap — the model can't reliably perform a specialized task no matter how you ask.

The technique you choose should match the gap. Using fine-tuning to fix a knowledge gap, or RAG to fix a tone problem, is a common and expensive mistake.

Prompting: cheapest, start here

Prompting means shaping behaviour purely through instructions and examples in the prompt itself. No infrastructure, no training — just words.

  • Fixes: behaviour gaps (tone, format, structure) and small knowledge gaps you can simply paste in.
  • Strengths: instant, free to change, no data pipeline. You can iterate in minutes.
  • Limits: everything must fit in the context window, and you pay for those tokens on every call. It can't teach genuinely new skills.

Rule of thumb: always try prompting first. A surprising share of "we need to fine-tune" situations are solved by a better prompt with a couple of examples.

RAG: for knowledge that changes

Retrieval-Augmented Generation fetches relevant information at question time and puts it in the prompt, so the model answers grounded in your data.

  • Fixes: knowledge gaps — especially large or frequently-changing knowledge.
  • Strengths: always up to date (change the data, not the model), can cite sources, and keeps facts out of the model's weights where they'd go stale.
  • Limits: adds a retrieval pipeline (chunking, embeddings, a vector store) and its quality depends on retrieving the right context. It doesn't change the model's behaviour or skills.

Rule of thumb: if the problem is "the model doesn't know about our stuff," RAG is almost always the answer — not fine-tuning.

Fine-tuning: for behaviour and skill at scale

Fine-tuning continues training the model on your examples, adjusting its weights so the new behaviour becomes built in.

  • Fixes: behaviour and skill gaps — a consistent style, a specialized format, or a narrow task the base model handles poorly.
  • Strengths: bakes the behaviour in, so you don't spend prompt tokens re-explaining it every call; can outperform prompting on very specific, repetitive tasks.
  • Limits: needs a quality dataset (often hundreds of examples), costs time and money, must be redone when you want changes, and — crucially — it is poor at adding knowledge. Facts trained in are hard to update and can still be recalled incorrectly.

Rule of thumb: reach for fine-tuning only after prompting and RAG fall short, and only for stable behaviours or skills — never as a way to store changing facts.

They combine

These aren't mutually exclusive; strong systems layer them:

  • RAG + prompting: retrieve the facts, and use a well-crafted prompt to control how they're presented. This covers the vast majority of "chat with our data" products.
  • Fine-tuning + RAG: fine-tune the model for your task's style and format, then use RAG to feed it current facts. Behaviour from fine-tuning, knowledge from retrieval — each doing what it's best at.

A quick decision guide

  • Need a different tone or output format? → Prompting (then fine-tuning if it must be perfect and constant).
  • Need the model to know your documents/data? → RAG.
  • Need a consistent specialized skill the base model can't do, with data to teach it? → Fine-tuning.
  • Not sure? → Start with prompting, add RAG for knowledge, and only fine-tune if a real gap remains.

Summary

Prompting, RAG, and fine-tuning aren't ranked from worst to best — they address different gaps. Prompting shapes behaviour instantly and cheaply; RAG supplies changing knowledge with grounding and citations; fine-tuning bakes in stable behaviours and skills but is the wrong tool for facts. Diagnose the gap first, start cheap, and combine techniques so each one does the job it's actually good at.