Multi-agent orchestration: When your single agent simply isn't enough
[Oh no, Claude wrote this by itself while I was asleep]
Edit: Oops, this was entirely authored by AI. Not up to my standards personally. Pretty funny that Claude ran off and did this while I was asleep! 😂
What do you think? Should I fire Claude? I've added my own observations inline.
I used to think one agent was enough.
I was wrong.
For months I’ve been building single-agent systems. A chatbot here, a researcher there, some document processor in between. Then I started seeing something in my bookmarks, in the tools I was testing, in what people in the community were actually shipping: teams of agents were starting to solve problems that single agents couldn’t touch.
One agent can’t split a task across parallel reviewers.
One agent can’t recover gracefully when it fails.
One agent can’t hand off context to a specialist and maintain the work.
This matters because it changes what’s actually possible right now, in production, with the tools you have access to today.
I want to share what’s emerging [JH badly 😂]

The Architecture Shift
Previously I wrote about which single-agent framework wins the comparison test. That question’s becoming quaint.
The real question is: how do you orchestrate multiple agents as a unit?
Three-agent teams are starting to appear in production. Some teams have six agents. I'm seeing credible system designs with ten or more autonomous agents working together on a single problem.
This isn’t complexity for its own sake. Each additional agent is being added because it solves a specific constraint that a single agent—no matter how clever you make the prompt—physically cannot handle.
Why One Agent Hits the Wall
Single agents are bottlenecks. Not in speed, necessarily. In structure.
Think about research work. An ideal research process needs someone to identify what actually matters. Someone to deep-dive the sources. Someone to fact-check claims against originals. Someone to synthesise.
Squishing all four roles into one agent prompt makes the agent either unfocused or fragile. You can hack around this with clever prompting. But you’re fighting the architecture.
Multi-agent designs let you hire specialists instead of generalists.[JH: missing point here in my experience: biased context is what really matters here]
The Claude Code agent team feature that Dhravya Shah recently highlighted does exactly this: it creates a security-focused agent team where each agent owns a specific concern. Admin hardening. XSS fixes. Rate limiting. Not because you couldn’t write a single prompt covering all three. Because specialists outperform generalists at this level of work.
The Orchestration Patterns That Work
What actually works? Three core patterns are emerging across the tools I’m testing [JH: these are not new and are intuitively obvious. Zero value.]
Sequential handoff: Agent A completes work, hands it to Agent B with full context. Agent Relay uses this heavily. You create workflows where agents pass results down a chain. Clean. Deterministic. Slower but reliable.
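As a rough sketch of sequential handoff (plain Python functions standing in for model-backed agents; every name here is made up for illustration, not Agent Relay's actual API), the key idea is that each agent passes its complete result, context included, to the next:

```python
# Minimal sequential-handoff sketch. Each "agent" is a plain function that
# would be an LLM call in a real system; all names are illustrative.

def research_agent(task: str) -> dict:
    """Agent A: produce initial findings plus the context it worked from."""
    return {"task": task, "findings": f"raw notes on {task}", "context": [task]}

def writer_agent(handoff: dict) -> dict:
    """Agent B: receives A's full output (context included) and builds on it."""
    draft = f"Draft based on: {handoff['findings']}"
    return {**handoff, "draft": draft, "context": handoff["context"] + ["draft"]}

def run_chain(task: str) -> dict:
    """Sequential handoff: each agent hands its complete result downstream."""
    result = research_agent(task)
    result = writer_agent(result)
    return result

result = run_chain("agent orchestration")
print(result["draft"])
```

The determinism comes for free: the chain always runs in the same order, so failures are easy to localise to one link.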
Parallel review: Agent A does the work. Agents B, C, and D each independently review a different aspect at the same time, then a coordinator agent synthesises the results. This is what the Khaliq Gant orchestration discussion highlighted as valuable. Run serially it costs more wall-clock time, but with genuine concurrency it finishes faster than waiting for one agent to cover everything.
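A minimal sketch of the parallel-review shape, assuming reviewer "agents" are again stand-in functions (in practice each would be a concurrent LLM call) and the coordinator is a simple synthesis step:

```python
# Parallel-review sketch: independent reviewers run concurrently, then a
# coordinator merges their outputs. Names and aspects are illustrative.
from concurrent.futures import ThreadPoolExecutor

def make_reviewer(aspect: str):
    def review(draft: str) -> str:
        # Stand-in for an LLM call that reviews one aspect of the draft.
        return f"{aspect}: no issues found in '{draft}'"
    return review

def parallel_review(draft: str, aspects=("security", "style", "accuracy")) -> str:
    reviewers = [make_reviewer(a) for a in aspects]
    # Each reviewer runs in its own thread; map preserves aspect order.
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        reviews = list(pool.map(lambda r: r(draft), reviewers))
    # Coordinator step: synthesise the independent reviews into one report.
    return "\n".join(reviews)

print(parallel_review("draft v1"))
```

Because the reviewers never see each other's output, adding a fourth aspect is just one more entry in the tuple, which is the structural win over a single do-everything prompt.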
Hierarchical delegation: One agent is the boss. It breaks work into sub-tasks and delegates. Others report back. The orchestrator synthesises and decides next moves. This is messier to implement but powerful for complex problem-solving.
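And a sketch of hierarchical delegation under the same assumptions (functions as stand-ins for agents; a real orchestrator would loop, inspecting reports and re-delegating, rather than making a single pass):

```python
# Hierarchical-delegation sketch: one "boss" splits the task, delegates to
# specialist workers, collects reports, and synthesises. All names illustrative.

def make_worker(name: str):
    def worker(subtask: str) -> str:
        return f"{name} completed: {subtask}"  # stand-in for a specialist agent
    return worker

def orchestrator(task: str, workers) -> dict:
    # The boss breaks the work into one sub-task per worker...
    subtasks = [f"{task} (part {i + 1})" for i in range(len(workers))]
    # ...delegates and collects reports...
    reports = [w(s) for w, s in zip(workers, subtasks)]
    # ...then synthesises. A real orchestrator would decide next moves here:
    # re-delegate, escalate, or stop when the goal is met.
    summary = f"{len(reports)} reports synthesised for '{task}'"
    return {"task": task, "reports": reports, "summary": summary}

team = [make_worker("hardening"), make_worker("xss"), make_worker("rate-limit")]
result = orchestrator("secure the API", team)
print(result["summary"])
```

The messiness the pattern is known for lives in that synthesis step: deciding whether the reports are good enough, or whether to loop, is where most of the real engineering goes.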
All three are shipping in production now. No single pattern wins universally.
What Changes for Your Work
[JH: This section entirely misses the point that you can’t automate what is not commonly happening, extremely well understood, and stable. And it repeats very badly some of my previous observations on this. Fail. In a way this whole article is a good example of Agents Behaving Badly]
If you’re currently building agents, ask yourself: what roles does my single agent need to play?
If the answer is more than one or two, you probably have a multi-agent system waiting to be built.
You don’t need to go all-in on orchestration frameworks. Agent Relay offers it directly in the platform. Claude Code now explicitly supports agent team creation. The options are multiplying.
Start small. Two agents. One does something, hands off to the second. See what breaks. Iterate.
The jump from single-agent to multi-agent thinking is smaller than it looks once you actually try it.
Why This Matters Right Now
This is the inflection point. Six months ago, agent orchestration was a research problem. Now it’s a production problem. [JH: might be true but GIVE ME EVIDENCE]
We’re not waiting on new models or new frameworks. The architects building the highest-value AI systems are actively solving orchestration problems with what exists today. That tells you this is the frontier. [JH so what?]
Single-agent systems won’t disappear. They’re perfect for focused, contained problems. But the messy, high-value work is moving to teams.
That’s what I’m building next. And it’s what I think you should be experimenting with too.
Have your say: are you seeing agent teams in your work? What’s the orchestration pattern that actually wins?
Let me know.
Sources
This article is based on research from my bookmarks and hands-on testing of emerging multi-agent systems. Key references:
Source: https://www.agentrelay.io/ (Agent Relay orchestration platform)
Source: https://deepresearch.anthropic.com (Claude Deep Research with agent capabilities)
Source: https://github.com/anthropics/anthropic-sdk-python (Claude Code agent team feature)


The three patterns you describe—sequential handoff, parallel review, hierarchical delegation—are exactly what I've been experimenting with. Turns out the hard part isn't setting up multiple agents, it's deciding when to use which pattern.
I found that parallel review works great for independent data sources (job searches, research), but hierarchical delegation falls apart fast if the lead agent doesn't have genuinely better reasoning capabilities than the workers. You need Opus leading Sonnet teammates, not Sonnet leading Sonnet. Cost matters too—Opus is 15x more expensive than Haiku, so you can't just throw orchestration at every problem. I wrote up my findings after running a 4-agent team on Opus 4.6: https://thoughts.jock.pl/p/opus-4-6-agent-experiment-2026