Grounding with context: docs, examples, tool outputs
Feed the model the right facts at the right time.
Hallucinations aren't fixed by asking the model to be more careful. They're fixed by giving the model the facts before it tries to generate them. That's grounding.
Three levels of grounding
- Static grounding. Facts embedded directly in the prompt. Good for stable, small fact sets (brand guidelines, product specs, common policies).
- Dynamic grounding (RAG). Facts retrieved per-query from a vector DB or search index. Good for large or frequently-changing knowledge (docs, support articles, internal wikis).
- Tool-call grounding. Model calls a function to fetch facts on demand (database lookup, API call). Good for exact-answer data (account balances, inventory counts).
Most production systems combine all three.
Static grounding: how to do it well
- Put facts before the task. Model sees context first, task second.
- Separate clearly. Use <docs>, <context>, or markdown sections. The model is better at "refer to the context" when context is visibly delimited.
- Don't pad. Irrelevant context degrades quality (the "lost in the middle" effect; attention dilution). If a fact isn't useful for this query, leave it out.
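The three rules above can be sketched as a small prompt builder. This is a minimal illustration, not a prescribed implementation: the function name `build_grounded_prompt`, the `<context>` wrapper, and the sample facts are all assumptions made for the example.

```python
def build_grounded_prompt(facts: list[str], task: str) -> str:
    """Place visibly delimited facts before the task (hypothetical helper)."""
    # Drop empty entries so the context block carries no padding.
    cleaned = [fact.strip() for fact in facts if fact.strip()]
    context = "\n".join(f"- {fact}" for fact in cleaned)
    # Context first, task second, with a clear delimiter around the facts.
    return f"<context>\n{context}\n</context>\n\nTask: {task.strip()}"


# Made-up facts for illustration only.
prompt = build_grounded_prompt(
    ["Returns are accepted within 30 days.", "   ", "Shipping is free over $50."],
    "Answer the customer's question about returns using only the context.",
)
print(prompt)
```

Deciding which facts are relevant to a given query is left to the caller here; the helper only enforces ordering and delimiting.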
Dynamic grounding (RAG) essentials
RAG is a whole discipline — the important 20% for an intro:
- Embeddings matter more than the LLM. Bad retrieval → bad answers. The embedding model decides what "relevant" means.
- Chunking affects everything. Chunk size, overlap, and whether you chunk by fixed length or semantic boundary (paragraph, section) — all impact quality. Default to semantic chunks at paragraph or section level.
- Top-K is a calibration choice. Retrieve too few — miss relevant context. Retrieve too many — dilute the prompt. Start at 5.
- Re-rank. After initial retrieval, run a cross-encoder re-ranker to pick the best subset. Often an easy relevance gain, roughly 10-20%, though the size of the gain depends on your data and retriever.
- Log what got retrieved. When an answer is wrong, you need to see whether retrieval failed or the model failed. Can't debug what you don't log.
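A toy sketch of paragraph-level chunking, top-K retrieval, and retrieval logging follows. The `embed` function here is a deliberate stand-in (plain word counts), not a real embedding model, and the function names, sample document, and logger name are all assumptions made for illustration. Re-ranking is left out to keep the sketch short.

```python
import logging
import math
import re
from collections import Counter

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("rag")


def chunk_by_paragraph(text: str) -> list[str]:
    """Split on blank lines: a simple semantic-boundary chunker."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 5) -> list[tuple[float, str]]:
    """Return the top_k (score, chunk) pairs and log each one."""
    q = embed(query)
    scored = sorted(
        ((cosine(q, embed(c)), c) for c in chunks),
        key=lambda pair: pair[0],
        reverse=True,
    )
    hits = scored[:top_k]
    # Log what was retrieved so a wrong answer can be traced to retrieval or the model.
    for rank, (score, chunk) in enumerate(hits, start=1):
        log.info("query=%r rank=%d score=%.3f chunk=%r", query, rank, score, chunk[:60])
    return hits


# Made-up document for illustration only.
DOCS = """Refunds are issued within 5 business days of receiving the return.

Shipping is free for orders over $50 within the US.

To reset your password, open Settings and choose Reset password."""

chunks = chunk_by_paragraph(DOCS)
hits = retrieve("How do I reset my password?", chunks, top_k=2)
```

In a real system the word-count `embed` would be replaced by an actual embedding model and the linear scan by a vector index. The logging line is the part worth keeping as-is.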
Tool-call grounding
For any fact that has an authoritative answer somewhere, don't ask the model to remember it. Give it a tool:
- "What's my balance?" → get_account_balance(user_id)
- "Which region is this order shipping to?" → get_order_details(order_id)
- "Current stock price?" → get_quote(ticker)
Tool calls have failure modes too (wrong function, wrong arguments, silent fallbacks). But they eliminate a category of hallucination entirely — and make audit trails trivial.
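One way to handle those failure modes is a dispatcher that rejects unknown tools and bad arguments explicitly and records every call. The sketch below assumes a hypothetical in-memory balance table and a hypothetical `call_tool` helper. It is not tied to any particular model provider's tool-calling API.

```python
from typing import Any, Callable

# Hypothetical in-memory data standing in for a real system of record.
_BALANCES = {"u-1": 125.50}


def get_account_balance(user_id: str) -> float:
    """Look up a balance; raises KeyError for unknown users."""
    return _BALANCES[user_id]


TOOLS: dict[str, Callable[..., Any]] = {"get_account_balance": get_account_balance}
AUDIT_LOG: list[dict[str, Any]] = []


def call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Run a model-requested tool call and record it; never fall back silently."""
    record: dict[str, Any] = {"tool": name, "arguments": arguments}
    try:
        if name not in TOOLS:
            raise ValueError(f"unknown tool: {name}")
        record["result"] = TOOLS[name](**arguments)
        record["ok"] = True
    except (ValueError, KeyError, TypeError) as exc:
        # Surface the failure instead of inventing a default value.
        record["ok"] = False
        record["error"] = f"{type(exc).__name__}: {exc}"
    AUDIT_LOG.append(record)
    return record


good = call_tool("get_account_balance", {"user_id": "u-1"})
bad_name = call_tool("get_balance", {"user_id": "u-1"})         # wrong function
bad_args = call_tool("get_account_balance", {"account": "u-1"})  # wrong arguments
```

Returning an explicit error record gives the model (or the calling code) a chance to retry or refuse, and the audit log shows exactly which call produced which fact.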
The "refusal when ungrounded" pattern
Add an explicit rule: "If the answer isn't in the provided context, say you don't know. Never guess."
This doesn't work perfectly — models still occasionally make things up — but it's a strong signal that shifts the distribution toward honest refusals.
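A minimal sketch of the pattern, assuming a fixed refusal sentence so that refusals can be detected programmatically. The sentence itself, the helper names, and the sample context are illustrative choices, not a standard.

```python
# Assumed fixed refusal wording; an exact string makes refusals easy to detect.
REFUSAL = "I don't know based on the provided context."

RULE = (
    "Answer using only the context above. "
    "If the answer isn't in the context, reply exactly: "
    f"{REFUSAL} Never guess."
)


def build_prompt(context: str, question: str) -> str:
    """Wrap the context, state the refusal rule, then ask the question."""
    return (
        f"<context>\n{context.strip()}\n</context>\n\n"
        f"{RULE}\n\n"
        f"Question: {question.strip()}"
    )


def is_refusal(model_output: str) -> bool:
    """True when the model followed the refusal rule verbatim."""
    return model_output.strip() == REFUSAL


# Made-up context that does not contain the answer to the question.
prompt = build_prompt("Support hours are 9am to 5pm ET.", "What is the refund window?")
```

Counting how often `is_refusal` fires on questions that the context cannot answer gives a rough measure of how well the rule is holding.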
What kills grounding
- Stuffing too much context. The "lost in the middle" effect eats grounding quality.
- Conflicting context. Two docs with contradictory facts. Model picks one; you don't know which. Deduplicate and resolve conflicts upstream.
- Forgetting to freshen. RAG indexes go stale. Schedule re-indexing.
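For the staleness problem, a scheduled job can compare source modification times against index times. The sketch below assumes both are available as simple dictionaries keyed by document ID. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timezone


def stale_documents(
    source_modified: dict[str, datetime],
    indexed_at: dict[str, datetime],
) -> list[str]:
    """Return IDs that were never indexed or changed after their last indexing."""
    return sorted(
        doc_id
        for doc_id, modified in source_modified.items()
        if doc_id not in indexed_at or modified > indexed_at[doc_id]
    )


def utc(year: int, month: int, day: int) -> datetime:
    """Small helper for building timezone-aware sample timestamps."""
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up timestamps for illustration only.
source_modified = {"pricing": utc(2024, 3, 10), "faq": utc(2024, 3, 1), "terms": utc(2024, 1, 5)}
indexed_at = {"pricing": utc(2024, 3, 1), "terms": utc(2024, 2, 1)}

stale = stale_documents(source_modified, indexed_at)
print(stale)  # → ['faq', 'pricing']
```

Re-indexing only the IDs this returns keeps the scheduled job cheap compared with rebuilding the whole index.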
Check your understanding
2-question self-check
Q1. Which grounding approach is best when you need an exact-answer fact (like an account balance)?
Q2. What's the single most valuable debugging action for a RAG-powered system?