How models are trained (and why it matters to you)
Pre-training, instruction tuning, alignment — and what each one means for your choices.
Picking a model without understanding how it was made is like buying a car based on the color. Three stages of training — pre-training, instruction tuning, alignment — each shape a different dial.
Act 1 — Pre-training
The base model ingests a huge corpus (public internet text, books, code, often trillions of tokens) and learns next-token prediction. At the end of pre-training you have a model that can continue text fluently but doesn't necessarily follow instructions. It'll happily autocomplete your input like it's a Wikipedia article.
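To make "next-token prediction" concrete, here is a toy sketch that counts which word follows which in a tiny corpus and then continues a prompt greedily. It is not how a real LLM is built (those use neural networks over subword tokens); the corpus, function names, and whitespace "tokenizer" are all made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: str) -> dict:
    """Count, for each token, which tokens follow it in the corpus."""
    tokens = corpus.split()
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def continue_text(counts: dict, prompt: str, max_new_tokens: int = 5) -> str:
    """Greedily append the most frequent next token, one token at a time."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # the last token was never seen in training
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)


# Hypothetical toy corpus standing in for "the internet".
corpus = "the cat sat on the mat . the cat sat on the sofa . the cat ate ."
model = train_bigram(corpus)
print(continue_text(model, "what did the", max_new_tokens=3))
# prints: what did the cat sat on
```

Note that the prompt reads like a question, but the toy model just keeps the text going. That is the base-model behavior described above, in miniature.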
Pre-training is where most of the capability lives. It's also where nearly all the compute cost lives: on the order of $10M-$100M+ for frontier runs.
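A back-of-envelope sketch of where a number of that size can come from, assuming the common rule of thumb that training costs roughly 6 floating-point operations per parameter per token. The model size, token count, GPU throughput, and hourly price below are hypothetical placeholders, not any vendor's real figures.

```python
def pretraining_cost_usd(params: float, tokens: float,
                         flops_per_gpu_sec: float,
                         usd_per_gpu_hour: float) -> float:
    """Rough compute cost of one training run (hardware rental only)."""
    total_flops = 6 * params * tokens  # rule-of-thumb approximation
    gpu_hours = total_flops / flops_per_gpu_sec / 3600
    return gpu_hours * usd_per_gpu_hour


# All four inputs are assumptions chosen for illustration.
cost = pretraining_cost_usd(
    params=4e11,             # 400B parameters
    tokens=1.5e13,           # 15T training tokens
    flops_per_gpu_sec=4e14,  # sustained throughput per GPU
    usd_per_gpu_hour=2.0,
)
print(f"about ${cost / 1e6:.0f}M")
```

With these inputs the estimate comes out around $50M. It ignores failed runs, experiments, data, and staff, so treat it as a floor on the order of magnitude, not a quote.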
Act 2 — Instruction tuning (SFT)
Humans (or other models) show the base model pairs of instruction + ideal response. The model learns to respond to instructions instead of just continuing text. This is what turns "fancy autocomplete" into "helpful assistant."
SFT is orders of magnitude cheaper than pre-training. It mostly changes tone and format, not raw intelligence.
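As a sketch of what one instruction + ideal response pair might look like by the time it reaches the trainer: the pair is rendered into a single token sequence, and a mask marks which tokens count toward the loss. Training only on the response tokens is one common choice, assumed here. The template strings, the `<eos>` marker, and the whitespace tokenization are hypothetical.

```python
def build_sft_example(instruction: str, response: str):
    """Return (tokens, loss_mask) for one instruction/response pair.

    The prompt template and whitespace tokenizer are illustrative only.
    """
    prompt_tokens = f"### Instruction: {instruction} ### Response:".split()
    response_tokens = response.split() + ["<eos>"]
    tokens = prompt_tokens + response_tokens
    # 0 = ignored by the loss (prompt), 1 = trained on (response).
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask


tokens, loss_mask = build_sft_example("Name a color.", "Blue.")
for token, flag in zip(tokens, loss_mask):
    print(flag, token)
```

The objective is still next-token prediction. What changes is the data: the model is only rewarded for producing the ideal response, given the instruction.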
Act 3 — Alignment (RLHF, Constitutional AI, DPO)
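Further tuning shapes the model to prefer helpful, honest, harmless responses. The usual method is reinforcement learning from human feedback (RLHF); Constitutional AI and Direct Preference Optimization (DPO) are alternatives that also learn from preference comparisons. This is where a model's personality is sculpted: how cautious it is, how it refuses, how much it pushes back.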
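To give a feel for what "prefer" means numerically, here is a minimal sketch of the DPO loss for a single preference pair. It assumes you already have the summed log-probability of the chosen and the rejected response under both the model being tuned (the policy) and a frozen reference copy. The input numbers and the `beta` value are hypothetical.

```python
import math


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair, given sequence log-probabilities."""
    chosen_gain = policy_chosen - ref_chosen        # how much more the policy likes the chosen answer
    rejected_gain = policy_rejected - ref_rejected  # same, for the rejected answer
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(margin)), written as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


# Policy identical to the reference: no preference learned yet.
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
# Policy has shifted toward the chosen answer: lower loss.
print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

The loss falls as the tuned model raises the chosen response's probability relative to the rejected one, measured against the reference. That pull away from rejected answers is how caution and refusal style get shaped.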
Why you should care
| Decision | Which stage drives it |
|---|---|
| Raw capability (hard reasoning, long context) | Pre-training |
| Tone and refusal behavior | Instruction tuning (tone, format) and alignment (refusals, caution) |
| Custom domain behavior (fine-tuning on your data) | Acts 2 + 3 — you're not rewriting Act 1 |
| Cost per token at inference | Pre-training decisions (size, architecture) |
When a new model ships, you're buying a bundle of all three. Benchmarks can move in opposite directions depending on which act changed.
What this predicts
If a new model feels smarter but refuses more requests, the alignment probably got tighter. If it's suddenly better at code, the vendor likely baked in more code pre-training data. If it's "chattier," the SFT corpus probably changed. These are informed guesses, not proof, but being able to attribute behavior changes to training decisions is half of vendor evaluation.
Check your understanding
3-question self-check
Q1. Which training phase accounts for the majority of an LLM's raw capability?
Q2. If a model suddenly refuses more requests than before, which training layer most likely changed?
Q3. When you fine-tune a foundation model on your own data, what are you mostly affecting?