Introduction
Multi-agent systems promise scalability and smarter reasoning. In production, more agents often mean more cost, more latency, and new failure modes that don't exist in single-agent designs. The decision to go multi-agent should be load-bearing: justified by a concrete reason you can name, not adopted by default because the diagram looks impressive.
Why this matters
- Each agent multiplies token spend; complex topologies can 10x cost without 10x value.
- Latency stacks: a 5-step chain at 2s/step is 10s, before any retry.
- New failure modes appear at the seams: orchestrator drift, hand-off corruption, conflicting beliefs.
- Many problems that look multi-agent are better solved with one well-designed agent and good tools.
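The cost and latency bullets above reduce to simple arithmetic, sketched below. The 5 steps at 2 s each come from the list; the retry probability, token counts, and price are illustrative assumptions, and both function names are hypothetical.

```python
def chain_latency_s(steps: int, seconds_per_step: float, retry_prob: float = 0.0) -> float:
    """Expected wall-clock time of a sequential chain of agent calls.

    Assumption: each step fails independently with probability retry_prob
    and is retried until it succeeds, so it needs 1 / (1 - retry_prob)
    attempts on average.
    """
    if not 0.0 <= retry_prob < 1.0:
        raise ValueError("retry_prob must be in [0, 1)")
    return steps * seconds_per_step / (1.0 - retry_prob)


def token_cost_usd(agents: int, tokens_per_agent: int, usd_per_1k_tokens: float) -> float:
    """Token spend when every agent consumes roughly the same amount of context."""
    return agents * tokens_per_agent * usd_per_1k_tokens / 1000.0


base = chain_latency_s(steps=5, seconds_per_step=2.0)       # 10.0 s, as in the list
with_retries = chain_latency_s(5, 2.0, retry_prob=0.2)      # roughly 12.5 s
single = token_cost_usd(agents=1, tokens_per_agent=4_000, usd_per_1k_tokens=0.01)
multi = token_cost_usd(agents=10, tokens_per_agent=4_000, usd_per_1k_tokens=0.01)
print(base, with_retries, multi / single)
```

Even this crude model shows that retries and agent count compound: the useful question is whether output quality rises by a comparable factor.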
Core concepts
When multi-agent helps
Genuinely parallelisable subproblems, capabilities that need different system prompts or models, and natural role boundaries (e.g. researcher / writer / reviewer) where the boundary buys reliability.
When multi-agent hurts
When the subtasks are tightly coupled, when the supervising agent becomes a bottleneck, when the only "win" is conceptual cleanliness, or when sub-agents need shared state you have to hand-wire.
Topologies
Star (one orchestrator, many workers), pipeline (linear hand-off), debate (parallel agents critiquing one another's output), and swarm (peer-to-peer). Each has different cost and latency characteristics.
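One rough way to compare the four topologies is to count model calls (a proxy for cost) and the number of sequential calls on the longest path (a proxy for latency). The counting rules below are simplifying assumptions made for illustration, not properties of any particular framework: the star orchestrator makes one planning call and one merging call, debate agents all speak once per round, and swarm peers message every other peer each round in the worst case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    llm_calls: int      # proxy for token cost
    critical_path: int  # sequential calls on the longest path, proxy for latency


def star(workers: int) -> Estimate:
    # plan call + parallel workers + merge call
    return Estimate(llm_calls=workers + 2, critical_path=3)


def pipeline(stages: int) -> Estimate:
    # every stage waits for the previous one
    return Estimate(llm_calls=stages, critical_path=stages)


def debate(agents: int, rounds: int) -> Estimate:
    # agents run in parallel within a round; rounds are sequential
    return Estimate(llm_calls=agents * rounds, critical_path=rounds)


def swarm(agents: int, rounds: int) -> Estimate:
    # worst case: each peer messages every other peer in each round
    return Estimate(llm_calls=agents * (agents - 1) * rounds, critical_path=rounds)


for name, estimate in [
    ("star", star(4)),
    ("pipeline", pipeline(4)),
    ("debate", debate(4, 2)),
    ("swarm", swarm(4, 2)),
]:
    print(f"{name:9s} calls={estimate.llm_calls:3d} critical_path={estimate.critical_path}")
```

Under these assumptions the pipeline is the cheapest but the slowest per call made, while the swarm's call count grows quadratically with the number of peers.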
Hidden costs
Tool schema duplication across agents, context passing tax, observability fragmentation, and prompt-version sprawl.
Practical patterns
Start single-agent
Always benchmark single-agent first. Only split when you can name the bottleneck a split would relieve.
Capacity-bounded fan-out
When you do run agents in parallel, cap concurrency hard and reuse results aggressively.
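A minimal sketch of this pattern, assuming an async worker function per task: a semaphore enforces the hard concurrency cap, and a per-call cache ensures duplicate tasks are executed only once. `fan_out` and `fake_worker` are hypothetical names; the worker here only sleeps and upper-cases its input in place of a real model call.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def fan_out(
    tasks: Sequence[str],
    worker: Callable[[str], Awaitable[str]],
    max_concurrency: int = 3,
) -> list:
    """Run worker over tasks with a hard concurrency cap and result reuse."""
    semaphore = asyncio.Semaphore(max_concurrency)
    cache = {}  # task text -> asyncio.Task, so duplicates share one run

    async def run_one(task: str) -> str:
        async with semaphore:  # at most max_concurrency workers inside here
            return await worker(task)

    def get_or_start(task: str):
        if task not in cache:
            cache[task] = asyncio.create_task(run_one(task))
        return cache[task]

    # gather preserves input order, including repeated tasks
    return await asyncio.gather(*(get_or_start(t) for t in tasks))


async def demo():
    active = peak = calls = 0

    async def fake_worker(task: str) -> str:
        # Stand-in for a model call; tracks how many workers run at once.
        nonlocal active, peak, calls
        calls += 1
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1
        return task.upper()

    results = await fan_out(["a", "b", "a", "c", "d", "b"], fake_worker, max_concurrency=2)
    return results, peak, calls


results, peak, calls = asyncio.run(demo())
print(results, peak, calls)
```

Six requested tasks result in four worker calls, and never more than two run at the same time. A production version would likely also want timeouts and a cache that persists across calls.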
Shared scratchpad
Give multi-agent systems a structured shared memory rather than passing prompts back and forth.
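As an illustration, a structured scratchpad can be as small as an append-only log of keyed entries that each agent reads selectively. The `Scratchpad` and `Entry` classes and the role and key names below are hypothetical; a real system would probably add persistence, schemas per key, and access control.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Entry:
    author: str  # which agent role wrote this
    key: str     # what kind of fact it is
    value: str


@dataclass
class Scratchpad:
    entries: list = field(default_factory=list)  # append-only history of Entry

    def write(self, author: str, key: str, value: str) -> None:
        self.entries.append(Entry(author, key, value))

    def read(self, key: str) -> Optional[str]:
        """Return the most recent value written under key, or None."""
        for entry in reversed(self.entries):
            if entry.key == key:
                return entry.value
        return None

    def view(self, keys: list) -> dict:
        """Return only the requested keys, so each agent's context stays small."""
        return {k: v for k in keys if (v := self.read(k)) is not None}


pad = Scratchpad()
pad.write("researcher", "sources", "three primary sources found")
pad.write("researcher", "open_questions", "pricing data is missing")
pad.write("writer", "draft", "first draft v1")
pad.write("writer", "draft", "first draft v2")

# The reviewer asks only for what it needs instead of receiving every prompt.
print(pad.view(["draft", "sources", "style_guide"]))
```

Because agents exchange keys rather than whole transcripts, the hand-off stays inspectable: the history records who wrote what, and missing keys are simply absent from the view.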
Per-agent eval suites
Each agent role gets its own eval; the overall system gets an end-to-end eval. Both are required.
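The toy example below suggests why both levels are needed: each role passes its own eval, yet the system fails end to end because the two roles disagree about the hand-off format. The `researcher` and `writer` functions are hypothetical stand-ins for model-backed agents, and exact string matching is used only to keep the sketch short.

```python
from typing import Callable, Sequence, Tuple

Case = Tuple[str, str]  # (input, expected output)


def pass_rate(fn: Callable[[str], str], cases: Sequence[Case]) -> float:
    """Fraction of cases where fn returns exactly the expected output."""
    return sum(fn(given) == expected for given, expected in cases) / len(cases)


def researcher(topic: str) -> str:
    return f"notes: {topic}"  # emits a lowercase prefix


def writer(notes: str) -> str:
    prefix = "NOTES: "  # expects an uppercase prefix: a seam mismatch
    if not notes.startswith(prefix):
        return "ERROR: unparseable notes"
    return f"Article about {notes[len(prefix):]}"


def system(topic: str) -> str:
    return writer(researcher(topic))


report = {
    "researcher": pass_rate(researcher, [("llamas", "notes: llamas")]),
    "writer": pass_rate(writer, [("NOTES: llamas", "Article about llamas")]),
    "end_to_end": pass_rate(system, [("llamas", "Article about llamas")]),
}
print(report)  # {'researcher': 1.0, 'writer': 1.0, 'end_to_end': 0.0}
```

The reverse also holds: when only the end-to-end score drops, the per-role evals are what indicate which agent regressed.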
Pitfalls to avoid
- Designing the topology before defining the success metric.
- Letting a "supervisor" agent see everything: its context bloats until it becomes a black hole.
- Recursive sub-agents without depth limits.
- Confusing "I added agents" with "I added value."
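The depth-limit pitfall has a simple structural fix: pass the current depth into every spawn and stop delegating at the limit. In this sketch `plan_subtasks` is a hypothetical stand-in for a planner that always wants to delegate further, which is the case a depth limit exists to contain.

```python
from typing import List, Optional, Tuple


def plan_subtasks(task: str) -> List[str]:
    # Stand-in for an LLM planner that would delegate forever if allowed.
    return [f"{task}.1", f"{task}.2"]


def run_agent(
    task: str,
    depth: int = 0,
    max_depth: int = 2,
    trace: Optional[List[Tuple[int, str]]] = None,
) -> List[Tuple[int, str]]:
    """Record every agent invocation; agents at max_depth must not spawn."""
    trace = [] if trace is None else trace
    trace.append((depth, task))
    if depth >= max_depth:
        return trace  # leaf agent: do the work directly
    for subtask in plan_subtasks(task):
        run_agent(subtask, depth + 1, max_depth, trace)
    return trace


trace = run_agent("root", max_depth=2)
print(len(trace), max(depth for depth, _ in trace))  # 7 2
```

With a branching factor of 2 and a limit of 2, the run is bounded at 1 + 2 + 4 = 7 agent invocations. Without the check, the same planner would recurse until something else failed.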
Key takeaways
- Multi-agent is a tool, not a virtue.
- Default to single-agent; split only with a measurable reason.
- Cap depth, fan-out, and total cost at the topology level.
- Eval at every seam.