Multi-agent systems without the chaos
When multiple agents help, when they don't, and how to coordinate them.
"Add more agents" is a seductive answer that usually makes things worse. Multi-agent architectures are powerful when they fit, a disaster when they don't. Here's when to use them.
When multi-agent genuinely helps
- Clear specialization. One agent is genuinely different from another (different tools, different expertise, different permissions).
- Parallel work on independent subproblems. Research a topic by having three agents research different angles in parallel.
- Hierarchical decomposition with narrow interfaces. A manager agent delegates to workers that return structured results.
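The parallel-research case above can be sketched with `asyncio`. This is a minimal illustration, not a prescribed implementation: `research_angle` is a hypothetical stub standing in for one agent's LLM and tool calls, and the topic and angles are made up.

```python
import asyncio


async def research_angle(topic: str, angle: str) -> dict:
    """Hypothetical worker: stands in for one agent's LLM and tool calls."""
    await asyncio.sleep(0)  # placeholder for real I/O-bound work
    return {"angle": angle, "summary": f"Notes on {topic} from the {angle} angle"}


async def research_in_parallel(topic: str, angles: list[str]) -> list[dict]:
    # The subproblems are independent, so workers never talk to each other;
    # only the caller combines their structured results.
    tasks = [research_angle(topic, angle) for angle in angles]
    return await asyncio.gather(*tasks)  # results come back in input order


results = asyncio.run(
    research_in_parallel("battery recycling", ["economics", "chemistry", "regulation"])
)
```

Because no worker depends on another's output, the only coordination point is the final `gather`, which is what keeps this pattern cheap.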
When multi-agent hurts
- "It sounds more impressive." Bad reason. One good prompt beats three agents arguing.
- "More agents = more intelligence." No. More agents means more coordination overhead, more points of failure, higher latency, higher cost.
- When a workflow would suffice. A deterministic workflow with one LLM call per step is often better than an agent-of-agents.
The coordination tax
Every additional agent adds:
- Communication overhead (structured messages between agents).
- Synchronization problems (who waits for what).
- Error propagation (one agent fails, others cascade).
- Debugging difficulty (you now need per-agent traces plus a joint trace).
As a rough rule of thumb, expect a 3-agent system to take 2-3× the engineering cost of a 1-agent system with equivalent capability.
Common architectures
- Manager-worker. One agent plans and delegates; workers execute specific subtasks and report back. Clean, common.
- Peer-to-peer specialists. Agents with distinct roles collaborate (writer + editor, researcher + analyst). Works if roles are genuinely distinct.
- Debate / red-team. Two agents take opposing stances; a third judges. Slow, expensive, sometimes produces better decisions.
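As a rough sketch of the manager-worker shape: the manager holds a plan, dispatches each subtask to a named worker, and collects structured reports. Everything here is hypothetical (the worker names, the fixed two-step plan, the report fields); a real manager would typically ask an LLM to produce the plan.

```python
from typing import Callable

Worker = Callable[[str], dict]  # hypothetical interface: subtask in, report out


def search_worker(subtask: str) -> dict:
    # Stand-in for an agent with search tools.
    return {"worker": "search", "subtask": subtask, "output": f"sources for: {subtask}"}


def summary_worker(subtask: str) -> dict:
    # Stand-in for an agent that condenses material.
    return {"worker": "summarize", "subtask": subtask, "output": f"summary of: {subtask}"}


WORKERS: dict[str, Worker] = {"search": search_worker, "summarize": summary_worker}


def manager(goal: str) -> list[dict]:
    # Assumption: a fixed plan, so the example stays deterministic.
    plan = [
        ("search", f"find sources on {goal}"),
        ("summarize", f"condense findings on {goal}"),
    ]
    reports = []
    for worker_name, subtask in plan:
        reports.append(WORKERS[worker_name](subtask))  # delegate, then collect
    return reports


reports = manager("agent evaluation")
```

Workers never call each other here; every exchange goes through the manager, which is what keeps the interface narrow.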
The interface problem
Agents talk via text. Loose interfaces mean drift and confusion. Design tight interfaces:
- Structured message schemas (not free-form natural language).
- Explicit "I'm done, here are the results" signals.
- Bounded loops (manager gives up after N rounds of back-and-forth).
Without these, your system degenerates into "agents chatting forever."
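The three rules above can be combined in a few lines. The sketch below is one possible shape, not a standard: the `WorkerReply` fields, the `Status` values, and the `make_worker` stub are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Status(str, Enum):
    DONE = "done"                      # the explicit "I'm done" signal
    NEEDS_REVISION = "needs_revision"


@dataclass(frozen=True)
class WorkerReply:
    task_id: str
    status: Status
    result: str


def parse_reply(raw: str) -> WorkerReply:
    """Reject anything that is not a JSON object with exactly the expected fields."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"task_id", "status", "result"}:
        raise ValueError("reply does not match the schema")
    return WorkerReply(data["task_id"], Status(data["status"]), data["result"])


def run_with_bound(task_id, worker, max_rounds=3):
    """Bounded loop: return the DONE reply, or None once the manager gives up."""
    feedback = ""
    for _ in range(max_rounds):
        reply = parse_reply(worker(task_id, feedback))
        if reply.status is Status.DONE:
            return reply
        feedback = reply.result  # pass the last attempt back for another round
    return None


def make_worker(done_after: int):
    """Hypothetical worker stub that reports DONE on its Nth call."""
    calls = {"n": 0}

    def worker(task_id: str, feedback: str) -> str:
        calls["n"] += 1
        status = "done" if calls["n"] >= done_after else "needs_revision"
        return json.dumps(
            {"task_id": task_id, "status": status, "result": f"attempt {calls['n']}"}
        )

    return worker


finished = run_with_bound("t1", make_worker(done_after=2))
gave_up = run_with_bound("t2", make_worker(done_after=5))
```

In this run, `finished` holds the second attempt and `gave_up` is `None`, because the second worker would need five rounds and the manager stops after three.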
Memory and context across agents
- Shared scratchpad. All agents see a common workspace. Easy coordination, expensive in tokens.
- Passed context. The manager passes relevant info to each worker, and workers return summaries. Better scaling.
- Persistent memory. Agents read/write to a shared store (vector DB, structured state). Useful for long-running systems.
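To make the token trade-off between a shared scratchpad and passed context concrete, here is a small sketch. The state keys, the `context_for` helper, and the use of character count as a stand-in for token count are all assumptions for illustration.

```python
import json


def context_for(worker_needs: list[str], state: dict) -> dict:
    """Passed context: copy only the keys this worker needs."""
    return {key: state[key] for key in worker_needs if key in state}


def size(payload: dict) -> int:
    # Character count as a crude proxy for token count (assumption).
    return len(json.dumps(payload))


# Hypothetical shared state that a scratchpad design would show to every agent.
state = {
    "goal": "Compare three vendors",
    "vendor_notes": "x" * 5000,  # long raw notes
    "budget": "under 10k per year",
    "style_guide": "y" * 3000,
}

# A pricing worker only needs the goal and the budget.
pricing_context = context_for(["goal", "budget"], state)
scratchpad_cost = size(state)
passed_cost = size(pricing_context)
```

Here `passed_cost` is a small fraction of `scratchpad_cost`, and the difference is paid again on every call each worker makes.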
The "simpler system first" test
Before shipping a multi-agent architecture, build the single-agent version and measure its failure rate. If a single agent gets 90% right and you need 95%, multi-agent is probably not the right lever — better prompts, better tools, or better evals usually close the gap faster.
If the single agent gets 40% and you need 95%, the gap is big enough that architecture changes might be justified.
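The test can be written down as a small helper. This is only a sketch of the reasoning above: the 10-point `small_gap` threshold is an assumption chosen so that the two examples in the text land on different sides, not a figure from the lesson.

```python
def pass_rate(outcomes: list[bool]) -> float:
    """Share of eval cases the single-agent baseline got right."""
    if not outcomes:
        raise ValueError("need at least one eval outcome")
    return sum(outcomes) / len(outcomes)


def suggest_lever(measured: float, target: float, small_gap: float = 0.10) -> str:
    # small_gap is an assumed cutoff; tune it for your own system.
    gap = target - measured
    if gap <= 0:
        return "ship the single agent"
    if gap <= small_gap:
        return "improve prompts, tools, or evals first"
    return "architecture changes may be justified"


close_baseline = pass_rate([True] * 9 + [False])     # 90% right
weak_baseline = pass_rate([True] * 4 + [False] * 6)  # 40% right

close_advice = suggest_lever(close_baseline, target=0.95)
weak_advice = suggest_lever(weak_baseline, target=0.95)
```

The point of the helper is the ordering of steps: measure the single-agent baseline first, then decide whether architecture is the lever.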
What works in production
Teams shipping multi-agent systems in 2026 tend to have:
- 2-4 agents max.
- Clear specialization (not just "another agent").
- Tight message schemas.
- A top-level controller that can stop the whole thing.
- Heavy logging and replay capability.
Teams that ship and regret it tend to have 5+ agents, free-form text coordination, and no orchestration layer.
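A top-level controller with a hard stop and a replayable log might look like the sketch below. The `Controller` class, the round-robin scheduling, and the `drafter`/`critic` stubs are hypothetical; the idea being illustrated is only that one component owns the step budget and records every step.

```python
import json
from dataclasses import dataclass, field
from typing import Callable

Agent = Callable[[dict], dict]  # hypothetical: reads state, returns an update


@dataclass
class Controller:
    agents: dict[str, Agent]
    max_steps: int = 10
    log: list[dict] = field(default_factory=list)

    def run(self, state: dict) -> dict:
        names = list(self.agents)
        for step in range(self.max_steps):
            name = names[step % len(names)]  # simple round-robin (assumption)
            update = self.agents[name](state)
            self.log.append({"step": step, "agent": name, "update": update})
            state = {**state, **update}
            if state.get("done"):
                return state
        # Step budget exhausted: the controller stops the whole system.
        self.log.append({"step": self.max_steps, "agent": "controller",
                         "update": {"stopped": "step budget exhausted"}})
        return {**state, "done": False, "stopped": True}

    def dump_log(self) -> str:
        # One JSON object per line, so a run can be replayed or diffed later.
        return "\n".join(json.dumps(entry) for entry in self.log)


def drafter(state: dict) -> dict:
    return {"drafts": state.get("drafts", 0) + 1}


def critic(state: dict) -> dict:
    return {"done": state.get("drafts", 0) >= 2}


ok_run = Controller({"drafter": drafter, "critic": critic})
ok_state = ok_run.run({})

cut_short = Controller({"drafter": drafter, "critic": critic}, max_steps=3)
cut_state = cut_short.run({})
```

The first run finishes when the critic accepts the second draft; the second run hits its three-step budget and is stopped by the controller, with the stop itself recorded in the log.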