Skip to main content
TopAIThreats home TOP AI THREATS
Harm Mechanism

Contagion

The spread of harmful outputs, compromised states, or adversarial inputs between connected AI agents.

Definition

Contagion in AI systems describes the propagation of compromised states, adversarial payloads, or harmful outputs from one AI agent to others within interconnected networks. When AI agents communicate, share data, or delegate tasks to one another, a vulnerability in a single agent can spread throughout the network. This propagation can occur through shared memory stores, tool outputs, inter-agent messaging, or contaminated data passed between systems. Unlike isolated failures, contagion creates systemic risk because the speed and autonomy of agent-to-agent communication often outpaces human monitoring capacity.

How It Relates to AI Threats

Contagion is a critical harm mechanism within Agentic & Autonomous threats, where multi-agent architectures create pathways for cascading compromise. As organisations deploy networks of AI agents that collaborate on complex tasks, the attack surface expands beyond individual systems to include inter-agent communication channels. A single compromised agent can inject malicious instructions or corrupted data into the broader network, potentially affecting downstream decisions, actions, and outputs across multiple autonomous systems simultaneously.

Why It Occurs

  • AI agents share memory and context without sufficient verification
  • Inter-agent communication protocols lack adversarial robustness
  • Compromised outputs become trusted inputs for downstream agents
  • Agent networks operate faster than human oversight can monitor
  • Trust relationships between agents are assumed rather than verified

Real-World Context

As multi-agent AI deployments grow in enterprise settings, security researchers have demonstrated how prompt injection attacks against one agent can cascade through orchestrated agent networks. In simulated environments, a single compromised agent providing tainted tool outputs has been shown to influence the reasoning of multiple downstream agents, leading to coordinated erroneous actions without triggering conventional monitoring systems.

Last updated: 2026-02-14