How Many Agents Are Too Many? The Hidden Cost of Multi-Agent Systems

Introduction

Multi-agent systems promise scalability and smarter reasoning. In production, more agents often mean more cost, more latency, and failure modes that don't exist in single-agent designs. The decision to go multi-agent should be load-bearing: driven by a real reason, not a default chosen because the diagram looks impressive.

Why this matters

Each added agent increases token spend; a complex topology can cost 10x as much without delivering 10x the value.
Latency stacks: a 5-step chain at 2s/step is 10s, before any retry.
New failure modes appear at the seams: orchestrator drift, hand-off corruption, conflicting beliefs.
Many problems that look multi-agent are better solved with one well-designed agent and good tools.
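
The latency and token arithmetic above can be written down as a back-of-envelope model. This is a minimal sketch, not a measurement: the function names are hypothetical, and it assumes every step fails independently at the same rate and is retried until it succeeds, and that each hand-off costs a fixed number of extra tokens.

```python
def chain_latency_s(steps: int, seconds_per_step: float, retry_rate: float = 0.0) -> float:
    """Expected wall-clock seconds for a sequential chain of agent steps.

    Assumption: each step fails independently with probability retry_rate and
    is retried until it succeeds, so expected attempts per step are
    1 / (1 - retry_rate).
    """
    if not 0.0 <= retry_rate < 1.0:
        raise ValueError("retry_rate must be in [0, 1)")
    return steps * seconds_per_step / (1.0 - retry_rate)


def chain_tokens(steps: int, tokens_per_call: int, handoff_tokens: int) -> int:
    """Total tokens when every step after the first also reads a hand-off.

    Assumption: hand-offs have a fixed size; real ones usually grow.
    """
    return steps * tokens_per_call + (steps - 1) * handoff_tokens


print(chain_latency_s(5, 2.0))       # 10.0, the 5-step chain from the list above
print(chain_latency_s(5, 2.0, 0.2))  # roughly 12.5 once one in five steps is retried
print(chain_tokens(5, 1000, 300))    # 6200
```

Plugging in your own per-step numbers before building anything is usually enough to show whether a chain fits the latency budget.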

Core concepts

When multi-agent helps

Multi-agent helps when subproblems are genuinely parallelisable, when capabilities need different system prompts or models, and when natural role boundaries (e.g. researcher / writer / reviewer) buy reliability.

When multi-agent hurts

Multi-agent hurts when the subtasks are tightly coupled, when the supervising agent becomes a bottleneck, when the only "win" is conceptual cleanliness, or when sub-agents need shared state that you have to hand-wire.

Topologies

The common topologies are star (one orchestrator, many workers), pipeline (linear hand-off), debate (parallel agents critiquing each other's output), and swarm (peer-to-peer). Each has different cost and latency characteristics.
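
To make the difference concrete for the two simplest topologies, here is a rough critical-path sketch. The function names and the plan/merge split are assumptions for illustration. Debate and swarm are left out because their cost depends on how many rounds or messages you allow.

```python
def pipeline_latency(step_seconds: list[float]) -> float:
    """Linear hand-off: each step waits for the previous one, so latencies add."""
    return sum(step_seconds)


def star_latency(plan_s: float, worker_seconds: list[float], merge_s: float) -> float:
    """Orchestrator plans, workers run in parallel, orchestrator merges.

    Assumption: concurrency is unbounded, so the slowest worker sets the pace.
    """
    return plan_s + max(worker_seconds, default=0.0) + merge_s


def star_calls(workers: int) -> int:
    """Model calls in one star round: one plan, one per worker, one merge (assumed)."""
    return workers + 2


print(pipeline_latency([2.0] * 5))              # 10.0
print(star_latency(1.0, [2.0, 3.0, 2.5], 1.0))  # 5.0
print(star_calls(3))                            # 5
```

Under these assumptions the star finishes sooner but makes the same number of paid calls as a 5-step pipeline, which is why latency and cost have to be tracked separately.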

Hidden costs

The less visible costs are tool schema duplication across agents, the context-passing tax, observability fragmentation, and prompt-version sprawl.

Practical patterns

Start single-agent

Always benchmark single-agent first. Only split when you can name the bottleneck a split would relieve.

Capacity-bounded fan-out

When you do run agents in parallel, cap concurrency hard and reuse results aggressively.
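
One way to sketch this with the standard library is an asyncio.Semaphore for the cap and a plain dict for reuse. The cap value, the fan_out helper, and run_worker (a stand-in for a real sub-agent call) are all hypothetical.

```python
import asyncio

MAX_CONCURRENCY = 3  # assumed cap; tune to your rate limits and budget


async def run_worker(task: str) -> str:
    """Hypothetical stand-in for a real sub-agent call."""
    await asyncio.sleep(0.01)
    return f"result:{task}"


async def fan_out(tasks: list[str], cache: dict[str, str]) -> list[str]:
    """Run tasks with bounded concurrency, reusing cached results."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)

    async def bounded(task: str) -> None:
        # At most MAX_CONCURRENCY workers are inside this block at once.
        async with semaphore:
            cache[task] = await run_worker(task)

    # Deduplicate the batch and skip anything already answered.
    pending = [t for t in dict.fromkeys(tasks) if t not in cache]
    await asyncio.gather(*(bounded(t) for t in pending))
    return [cache[t] for t in tasks]


cache: dict[str, str] = {}
results = asyncio.run(fan_out(["a", "b", "a", "c"], cache))
print(results)  # ['result:a', 'result:b', 'result:a', 'result:c']
```

The duplicate "a" costs one call instead of two, and passing the same cache to a later batch skips work that is already done.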

Shared scratchpad

Give multi-agent systems a structured shared memory rather than passing prompts back and forth.
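
A minimal sketch of what "structured" could mean here: an append-only log of typed entries that agents read by key. The Entry and Scratchpad names and their fields are hypothetical. A production version would likely need persistence and access control.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    author: str  # which agent role wrote the entry
    key: str     # what the entry is about
    value: str


@dataclass
class Scratchpad:
    """Append-only shared memory: agents read by key instead of re-sending prompts."""
    entries: list[Entry] = field(default_factory=list)

    def write(self, author: str, key: str, value: str) -> None:
        self.entries.append(Entry(author, key, value))

    def latest(self, key: str) -> Optional[str]:
        """Most recent value for key, or None if nothing was written."""
        for entry in reversed(self.entries):
            if entry.key == key:
                return entry.value
        return None

    def history(self, key: str) -> list[Entry]:
        """Every entry for key, oldest first, with authorship preserved."""
        return [e for e in self.entries if e.key == key]


pad = Scratchpad()
pad.write("researcher", "sources", "3 papers found")
pad.write("reviewer", "sources", "2 papers verified")
print(pad.latest("sources"))        # 2 papers verified
print(len(pad.history("sources")))  # 2
```

Keeping the author on every entry means a conflicting belief can be traced to the agent that wrote it.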

Per-agent eval suites

Each agent role gets its own eval; the overall system gets an end-to-end eval. Both are required.
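
The two layers might be wired up as follows. This is a sketch under simplifying assumptions: the researcher and writer roles are toy functions, and exact string match stands in for whatever scoring your evals really use.

```python
from typing import Callable

Agent = Callable[[str], str]
Case = tuple[str, str]  # (input, expected output)


def pass_rate(agent: Agent, cases: list[Case]) -> float:
    """Fraction of cases where the agent's output matches exactly."""
    if not cases:
        raise ValueError("an eval suite needs at least one case")
    return sum(agent(x) == expected for x, expected in cases) / len(cases)


# Hypothetical roles standing in for real agents.
def researcher(question: str) -> str:
    return f"notes({question})"


def writer(notes: str) -> str:
    return f"draft({notes})"


def system(question: str) -> str:
    return writer(researcher(question))


role_suites: dict[str, tuple[Agent, list[Case]]] = {
    "researcher": (researcher, [("x", "notes(x)")]),
    "writer": (writer, [("n", "draft(n)")]),
}
end_to_end: list[Case] = [("x", "draft(notes(x))")]

report = {role: pass_rate(agent, cases) for role, (agent, cases) in role_suites.items()}
report["end_to_end"] = pass_rate(system, end_to_end)
print(report)  # {'researcher': 1.0, 'writer': 1.0, 'end_to_end': 1.0}
```

When a role suite passes but the end-to-end score drops, the regression is probably in a hand-off rather than in an agent.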

Pitfalls to avoid

Designing the topology before defining the success metric.
Letting a "supervisor" agent see everything; it becomes a context-bloat black hole.
Recursive sub-agents without depth limits.
Confusing "I added agents" with "I added value."
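
For the recursive sub-agent pitfall, a depth limit can be as small as one extra parameter. In this sketch MAX_DEPTH, split, and solve are hypothetical. It also assumes that at the limit the agent should answer directly rather than fail.

```python
MAX_DEPTH = 2  # assumed limit; deeper delegation is refused


def split(task: str) -> list[str]:
    """Hypothetical decomposition: halve the task until it is one character."""
    if len(task) <= 1:
        return []
    mid = len(task) // 2
    return [task[:mid], task[mid:]]


def solve(task: str, depth: int = 0) -> str:
    """Delegate to sub-agents recursively, never past MAX_DEPTH."""
    subtasks = split(task)
    if depth >= MAX_DEPTH or not subtasks:
        # At the limit, or nothing left to split: handle the task directly.
        return f"done:{task}"
    return "+".join(solve(sub, depth + 1) for sub in subtasks)


print(solve("abcdefgh"))  # done:ab+done:cd+done:ef+done:gh
```

Without the depth check the same input would be split all the way down to eight single-character tasks. With it, the number of delegated calls is bounded by the limit and the branching factor.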