Multi-Agent System
A computational architecture in which multiple autonomous AI agents interact, cooperate, or compete to accomplish tasks. These systems introduce emergent risks from coordination failures, conflicting objectives, and cascading errors between agents.
Definition
A multi-agent system (MAS) consists of two or more autonomous AI agents that operate within a shared environment, each capable of independent perception, reasoning, and action. Agents in such systems may cooperate toward shared objectives, negotiate over competing goals, or operate independently on parallel tasks. Modern implementations range from orchestrated pipelines — where a supervisory agent delegates tasks to specialized sub-agents — to decentralized swarms where agents self-organize without central control. The complexity introduced by agent-to-agent interaction creates emergent behaviors that cannot be predicted from the properties of individual agents alone, making safety assurance and failure analysis substantially more challenging than for single-agent systems.
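The orchestrated-pipeline pattern described above can be sketched in a few lines. This is a minimal illustration, not any particular framework's API; the `Agent` and `Supervisor` classes, skill names, and task strings are all hypothetical stand-ins.

```python
# Minimal sketch of an orchestrated multi-agent pipeline: a supervisory
# agent delegates sub-tasks to specialized worker agents and collects
# their results. All names here are illustrative, not from a real framework.

class Agent:
    def __init__(self, name, skill):
        self.name = name
        self.skill = skill  # the kind of sub-task this agent handles

    def act(self, task):
        # Stand-in for perception/reasoning/action; a real agent would
        # call a model or external tool here.
        return f"{self.name} completed {self.skill} for: {task}"

class Supervisor:
    def __init__(self, workers):
        # Map each skill to the specialist responsible for it.
        self.workers = {w.skill: w for w in workers}

    def run(self, task, plan):
        # Delegate each step of the plan to the matching specialist.
        return [self.workers[skill].act(task) for skill in plan]

pipeline = Supervisor([Agent("searcher", "research"),
                       Agent("coder", "implementation"),
                       Agent("reviewer", "review")])
results = pipeline.run("add caching layer",
                       plan=["research", "implementation", "review"])
```

Even in this toy version, the system-level behavior depends on the plan, the delegation mapping, and each agent's output, which is why interaction effects dominate failure analysis in real deployments.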
How It Relates to AI Threats
Multi-agent systems are a key concern within the Agentic and Autonomous AI Threats domain. Two sub-categories directly address MAS risks: multi-agent coordination failures, where agents working toward shared goals produce unintended outcomes through misaligned strategies or communication breakdowns, and agent-to-agent propagation, where errors, hallucinations, or adversarial inputs in one agent cascade through the system via inter-agent communication. The opacity of emergent multi-agent behavior makes it difficult for human operators to anticipate failure modes, monitor system state, or intervene effectively when problems arise. These challenges intensify as agent counts increase and interaction patterns grow more complex.
Why It Occurs
- Emergent behavior in multi-agent systems arises from interactions that are not explicitly programmed and cannot be fully predicted in advance
- Individual agents may develop implicit strategies that conflict with system-level objectives when optimizing their local goals
- Communication protocols between agents can transmit and amplify errors, hallucinations, or adversarial manipulations across the system
- Human oversight becomes increasingly difficult as the number of agents and the speed of their interactions exceed human monitoring capacity
- Testing and verification methods for single-agent systems do not scale adequately to the combinatorial complexity of multi-agent interactions
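The error-amplification mechanism in the third bullet can be made concrete with a toy pipeline. The agents below are deterministic stand-ins for model calls, and the revenue figures are invented for illustration; the point is only that an unverified upstream output becomes trusted downstream input.

```python
# Toy illustration of agent-to-agent error propagation: a wrong fact
# produced by the first agent is consumed as trusted input by each
# downstream agent, so the error survives (and compounds) end to end.

def researcher(query):
    # Simulated hallucination: confidently returns a fabricated figure.
    return {"query": query, "fact": "Q3 revenue was $12M"}

def analyst(report):
    # Treats the upstream output as ground truth and builds on it.
    return {**report,
            "analysis": f"Growth implied by '{report['fact']}' looks strong"}

def writer(analysis):
    # The final agent publishes the compounded error.
    return f"Summary: {analysis['analysis']} ({analysis['fact']})"

summary = writer(analyst(researcher("Q3 revenue")))
# The fabricated figure appears in the final output because no agent
# in the chain independently verified its input.
```

Mitigations typically insert verification at the hand-off points (cross-checking against source data, or a dedicated critic agent), since each unvalidated inter-agent message is a propagation channel.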
Real-World Context
Incident INC-25-0001 illustrates how AI-orchestrated operations leveraging multiple autonomous components can achieve effects that exceed the capabilities of any single agent. The rapid commercial adoption of multi-agent frameworks — including those marketed for enterprise automation, software development, and research — has outpaced the development of safety standards for agent interaction. Regulatory attention is increasing: the EU AI Act’s provisions on general-purpose AI systems and the Bletchley Declaration’s focus on frontier AI risks both encompass multi-agent deployment scenarios. Industry groups are developing evaluation benchmarks specifically for multi-agent safety.
Last updated: 2026-02-14