
Why Multi-Agent Security Isn't Optional: What Google and MIT Found

New research quantifies the chaos: uncoordinated agents amplify errors 17x. The question is who builds the guardrails.

Tags: research, multi-agent security, validation

Google, MIT, and DeepMind just published research that should worry anyone deploying multi-agent systems without proper security infrastructure.

The paper, “Towards a Science of Scaling Agent Systems,” systematically evaluated when multi-agent coordination helps versus hurts—across 180 configurations, 5 architectures, and 3 LLM families. The findings move beyond “more agents is all you need” to quantify exactly how multi-agent systems fail.

The headline number: uncoordinated agents amplify errors 17x.

The Experiment

The researchers tested five canonical architectures:

  • Single-Agent (SAS): One agent, full context
  • Independent MAS: Multiple agents, no coordination
  • Centralized MAS: Orchestrator validates before accepting
  • Decentralized MAS: Agents coordinate peer-to-peer
  • Hybrid MAS: Mixed coordination patterns

They standardized tools, prompts, and token budgets to isolate architectural effects. The goal: derive quantitative scaling principles rather than rely on intuition.

Error Amplification Is Architecture-Dependent

The most striking finding is how dramatically architecture affects error propagation:

Architecture     Error Amplification
Single-Agent     1.0x (baseline)
Centralized      4.4x
Hybrid           5.1x
Decentralized    7.8x
Independent      17.2x

Independent agents—multiple agents operating without coordination—amplify errors 17 times compared to a single agent. Centralized architectures reduce this to 4.4x through validation bottlenecks that catch errors before they propagate.
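To make that concrete, here is a minimal sketch of the centralized pattern in Python. The Agent type, the validate() hook, and the retry policy are illustrative assumptions on my part, not the paper's experimental harness; the point is simply that nothing a worker produces is accepted until an explicit check passes.

```python
# Minimal sketch of a centralized MAS round: an orchestrator accepts a worker's
# output only after an explicit validation step. The Agent type, validate()
# hook, and retry policy are illustrative assumptions, not the paper's harness.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Agent:
    name: str
    run: Callable[[str], str]   # produces a candidate answer for a subtask

def centralized_round(task: str,
                      workers: List[Agent],
                      validate: Callable[[str, str], bool],
                      max_retries: int = 2) -> List[str]:
    """Collect worker outputs, re-asking whenever validation fails."""
    accepted: List[str] = []
    for worker in workers:
        for _ in range(max_retries + 1):
            candidate = worker.run(task)
            if validate(task, candidate):          # the validation bottleneck
                accepted.append(candidate)
                break
        else:
            accepted.append(f"[{worker.name}] produced no validated output")
    return accepted
```

Strip out the validate() call and you are left with the independent configuration, the one that amplifies errors 17x.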

This is the confused deputy problem from operating systems security, empirically validated at scale in AI systems.

Coordination Overhead Scales Super-Linearly

The paper also quantifies coordination costs:

Architecture     Overhead vs. Single-Agent
Single-Agent     0%
Independent      58%
Decentralized    263%
Centralized      285%
Hybrid           515%

Turn count follows a power law: T = 2.72 × (n+0.5)^1.724

Beyond 3-4 agents, per-agent reasoning capacity becomes “prohibitively thin.” Hybrid systems require 6.2x more turns than single-agent approaches.
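Plugging a few values into that fit (taking n as the number of agents) shows how quickly the turn budget gets consumed. The per-agent ratios below are my own arithmetic on the published formula, not numbers from the paper:

```python
# Turn-count power law reported in the paper: T = 2.72 * (n + 0.5) ** 1.724,
# with n taken here to be the number of agents.
def turns(n_agents: int) -> float:
    return 2.72 * (n_agents + 0.5) ** 1.724

baseline = turns(1)  # single-agent turn count under the same fit
for n in range(1, 7):
    t = turns(n)
    print(f"{n} agents: {t:6.1f} turns ({t / baseline:.1f}x single-agent)")
```

By four agents the fitted curve is already in the 6-7x range relative to a single agent, which lines up broadly with the 6.2x figure quoted for hybrid systems.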

Multi-Agent Still Wins—On the Right Tasks

The research doesn’t argue against multi-agent systems. It argues for matching architecture to task structure:

Task Type                              Best Architecture    Performance Delta
Financial reasoning (parallelizable)   Centralized          +80.9%
Web navigation (dynamic)               Decentralized        +9.2%
Sequential planning                    Single-Agent         MAS degrades 39-70%

Centralized coordination excels on parallelizable tasks. Decentralized works for dynamic environments. Sequential reasoning tasks degrade with any multi-agent approach.

The takeaway: multi-agent coordination creates value when tasks decompose naturally. It destroys value when forced onto sequential workflows.

What This Means for Security

Here’s where the research gets relevant for anyone building agent infrastructure.

The 75% reduction in error amplification from centralized architectures comes from validation bottlenecks—checkpoints that verify agent outputs before accepting them. In the paper, that’s an orchestrator agent reviewing other agents’ reasoning.

But validation bottlenecks can exist at multiple levels:

  • Reasoning validation: “Is this agent’s output logically sound?”
  • Action validation: “Is this agent authorized to take this action?”

The paper studies reasoning validation. Capability-based security provides action validation.

These are complementary, not competing. An orchestrator can verify that Agent B’s analysis is correct. A capability system verifies that Agent B can only call transfer() for amounts under $500 to pre-approved accounts.
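As a concrete (and entirely hypothetical) illustration of the action side, here is a minimal capability check in Python. The TransferCapability type, the limits, and transfer() itself are invented for this sketch; the point is that the constraint is evaluated at the moment of action, independently of how the agent reasoned its way there.

```python
# Sketch of action-level validation: a capability scopes what an agent may do,
# and the check runs before the action executes. Names, limits, and the
# transfer() signature are illustrative, not a real capability-security API.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class TransferCapability:
    holder: str
    max_amount: float
    allowed_accounts: frozenset = field(default_factory=frozenset)

    def permits(self, amount: float, account: str) -> bool:
        return amount <= self.max_amount and account in self.allowed_accounts

def transfer(amount: float, account: str, cap: TransferCapability) -> str:
    if not cap.permits(amount, account):
        raise PermissionError(f"{cap.holder}: ${amount} to {account} exceeds granted scope")
    return f"transferred ${amount} to {account}"

cap = TransferCapability("agent-b", max_amount=500,
                         allowed_accounts=frozenset({"acct-42"}))
transfer(450, "acct-42", cap)        # within scope: allowed
# transfer(900, "acct-42", cap)      # over the cap: raises PermissionError
```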

Guardrails, Not Coordinators

This distinction matters for understanding what different infrastructure layers provide.

Coordinators (what the paper studies):

  • “Agent B, review Agent A’s reasoning before we proceed”
  • Manages how agents think together
  • Reduces errors through cognitive validation

Guardrails (what capability security provides):

  • “Agent A can only call these functions with these constraints”
  • Manages what agents can do
  • Reduces damage through action constraints

You need both. The paper shows that uncoordinated reasoning amplifies errors 17x. But even perfectly coordinated agents can cause damage if they have unrestricted access to sensitive operations.

A well-reasoned decision to transfer $50,000 is still a problem if the agent was only authorized for $500.
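A compact way to see the layering, again as a hedged sketch rather than any particular framework's API: the reasoning check and the action check are separate gates, and passing the first says nothing about the second. The reviewer stub and the $500 cap below are stand-ins.

```python
# Defense in depth, sketched: a reasoning-layer review (the coordinator's job)
# followed by an action-layer authorization (the guardrail's job). Both gates
# are stand-ins; a real system would back the second with enforceable tokens.
AUTHORIZED_LIMIT = 500  # what this agent was actually granted

def reviewer_approves(plan: str) -> bool:
    # Stand-in for an orchestrator reviewing another agent's reasoning.
    return True  # assume the $50,000 transfer is impeccably argued

def authorize(amount: float) -> bool:
    return amount <= AUTHORIZED_LIMIT

plan, amount = "consolidate vendor payments", 50_000
if reviewer_approves(plan) and authorize(amount):
    print("execute transfer")
else:
    print("blocked at the action layer despite sound reasoning")
```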

The Market Reality

The paper’s efficiency findings might seem like a bear case for multi-agent systems. They’re not.

Multi-agent is happening regardless of efficiency concerns:

  • 46% of YC’s Spring 2025 batch were AI agent companies
  • Anthropic, OpenAI, and Google are all shipping agent frameworks
  • Enterprise deployments are accelerating

The market doesn’t optimize for theoretical efficiency—it builds what solves business problems. And multi-agent architectures solve problems that single agents can’t: parallelization, specialization, cross-organizational workflows.

The question isn’t whether multi-agent systems will be built. It’s whether they’ll be built with proper infrastructure.

The Infrastructure Gap

The paper identifies coordination overhead as a key challenge. Current solutions focus on reasoning coordination—better prompts, smarter orchestrators, more sophisticated handoff protocols.

What’s missing is action coordination: cryptographic enforcement of what agents can actually do.

When the research shows that centralized validation reduces error amplification by 75%, that’s validation at the reasoning layer. Add validation at the action layer—capability constraints, attenuation at each hop, revocation that propagates—and you get defense in depth.
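What attenuation might look like in miniature, assuming a simple Python object stands in for what would really be a signed, verifiable token (revocation propagation is omitted here for brevity):

```python
# Illustrative sketch of capability attenuation: each delegation hop may only
# narrow the scope it received, never widen it. Plain objects stand in for
# signed, verifiable tokens; revocation propagation is not modeled here.
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    subject: str            # who holds it
    actions: frozenset      # what it allows
    max_amount: float       # upper bound on any single action

    def attenuate(self, subject: str, actions: frozenset,
                  max_amount: float) -> "Capability":
        if not actions <= self.actions or max_amount > self.max_amount:
            raise ValueError("attenuation can only narrow scope")
        return Capability(subject, actions, max_amount)

root = Capability("orchestrator", frozenset({"read", "transfer"}), max_amount=5_000)
hop1 = root.attenuate("agent-a", frozenset({"transfer"}), max_amount=500)
hop2 = hop1.attenuate("agent-b", frozenset({"transfer"}), max_amount=100)
# hop1.attenuate("agent-c", frozenset({"delete"}), 50)  -> ValueError: not a subset
```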

Multi-agent systems need both:

  1. Coordination infrastructure that reduces reasoning errors
  2. Security infrastructure that constrains action scope

The first is being actively researched. The second is the missing layer.


This is part of a series on AI agent security. See The Missing Layer for the authorization gap, and Proof of Continuity for how capability-based security addresses it.