Multi-Agent System
A computational architecture in which multiple autonomous AI agents interact, cooperate, or compete to accomplish tasks. These systems introduce emergent risks from coordination failures, conflicting objectives, and cascading errors between agents.
Definition
A multi-agent system (MAS) consists of two or more autonomous AI agents that operate within a shared environment, each capable of independent perception, reasoning, and action. Agents in such systems may cooperate toward shared objectives, negotiate over competing goals, or operate independently on parallel tasks. Modern implementations range from orchestrated pipelines — where a supervisory agent delegates tasks to specialized sub-agents — to decentralized swarms where agents self-organize without central control. The complexity introduced by agent-to-agent interaction creates emergent behaviors that cannot be predicted from the properties of individual agents alone, making safety assurance and failure analysis substantially more challenging than for single-agent systems.
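The orchestrated-pipeline pattern described above can be sketched in a few lines. This is a minimal illustration, not any particular framework's API; the `Agent` and `Supervisor` classes, skill names, and task strings are all hypothetical stand-ins.

```python
# Minimal sketch of an orchestrated multi-agent pipeline: a supervisory
# agent delegates sub-tasks to specialized worker agents and collects
# their results. All names here are illustrative, not from a real framework.

class Agent:
    def __init__(self, name, skill):
        self.name = name
        self.skill = skill  # the kind of sub-task this agent handles

    def act(self, task):
        # Stand-in for perception/reasoning/action; a real agent would
        # call a model or external tool here.
        return f"{self.name} completed {self.skill} for: {task}"

class Supervisor:
    def __init__(self, workers):
        # Map each skill to the specialist responsible for it.
        self.workers = {w.skill: w for w in workers}

    def run(self, task, plan):
        # Delegate each step of the plan to the matching specialist.
        return [self.workers[skill].act(task) for skill in plan]

pipeline = Supervisor([Agent("searcher", "research"),
                       Agent("coder", "implementation"),
                       Agent("reviewer", "review")])
results = pipeline.run("add caching layer",
                       plan=["research", "implementation", "review"])
```

Even in this toy version, the system-level behavior depends on the plan, the delegation mapping, and each agent's output, which is why interaction effects dominate failure analysis in real deployments.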
How It Relates to AI Threats
Multi-agent systems are a key concern within the Agentic and Autonomous AI Threats domain. Two sub-categories directly address MAS risks: multi-agent coordination failures, where agents working toward shared goals produce unintended outcomes through misaligned strategies or communication breakdowns, and agent-to-agent propagation, where errors, hallucinations, or adversarial inputs in one agent cascade through the system via inter-agent communication. The opacity of emergent multi-agent behavior makes it difficult for human operators to anticipate failure modes, monitor system state, or intervene effectively when problems arise. These challenges intensify as agent counts increase and interaction patterns grow more complex.
Why It Occurs
- Emergent behavior in multi-agent systems arises from interactions that are not explicitly programmed and cannot be fully predicted in advance
- Individual agents may develop implicit strategies that conflict with system-level objectives when optimizing their local goals
- Communication protocols between agents can transmit and amplify errors, hallucinations, or adversarial manipulations across the system
- Human oversight becomes increasingly difficult as the number of agents and the speed of their interactions exceed human monitoring capacity
- Testing and verification methods for single-agent systems do not scale adequately to the combinatorial complexity of multi-agent interactions
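The error-amplification mechanism in the third bullet can be made concrete with a toy pipeline. The agents below are deterministic stand-ins for model calls, and the revenue figures are invented for illustration; the point is only that an unverified upstream output becomes trusted downstream input.

```python
# Toy illustration of agent-to-agent error propagation: a wrong fact
# produced by the first agent is consumed as trusted input by each
# downstream agent, so the error survives (and compounds) end to end.

def researcher(query):
    # Simulated hallucination: confidently returns a fabricated figure.
    return {"query": query, "fact": "Q3 revenue was $12M"}

def analyst(report):
    # Treats the upstream output as ground truth and builds on it.
    return {**report,
            "analysis": f"Growth implied by '{report['fact']}' looks strong"}

def writer(analysis):
    # The final agent publishes the compounded error.
    return f"Summary: {analysis['analysis']} ({analysis['fact']})"

summary = writer(analyst(researcher("Q3 revenue")))
# The fabricated figure appears in the final output because no agent
# in the chain independently verified its input.
```

Mitigations typically insert verification at the hand-off points (cross-checking against source data, or a dedicated critic agent), since each unvalidated inter-agent message is a propagation channel.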
Real-World Context
Incident INC-25-0001 illustrates how AI-orchestrated operations leveraging multiple autonomous components can achieve effects that exceed the capabilities of any single agent. The rapid commercial adoption of multi-agent frameworks — including those marketed for enterprise automation, software development, and research — has outpaced the development of safety standards for agent interaction. Regulatory attention is increasing: the EU AI Act’s provisions on general-purpose AI systems and the Bletchley Declaration’s focus on frontier AI risks both encompass multi-agent deployment scenarios. Industry groups are developing evaluation benchmarks specifically for multi-agent safety.
Last updated: 2026-02-14