Context Injection

Definition

Context injection is a class of adversarial attacks targeting AI systems that rely on retrieved or dynamically assembled context to inform their outputs. Attackers craft malicious content designed to be retrieved by the AI system during its normal operation, thereby inserting adversarial instructions or misleading information into the model’s context window. This technique exploits the fundamental architecture of retrieval-augmented generation and agentic AI systems, where the boundary between trusted instructions and untrusted retrieved content is often poorly defined or entirely absent.

How It Relates to AI Threats

Context injection is a significant technical attack within Agentic & Autonomous threats, particularly in systems that use external memory, retrieval-augmented generation, or tool-based information gathering. When an AI agent retrieves poisoned documents, reads compromised web pages, or processes manipulated database entries, the injected content can override system instructions, exfiltrate data, or redirect the agent’s behaviour. This attack vector becomes more dangerous as agents gain greater autonomy and access to consequential tools and actions.

Why It Occurs

AI systems cannot reliably distinguish instructions from retrieved content
Retrieval pipelines lack adversarial filtering of ingested documents
Agents process untrusted external data with the same trust as system prompts
Context windows blend multiple information sources without clear boundaries
Defence mechanisms lag behind the sophistication of injection techniques

Real-World Context

Security researchers have demonstrated context injection attacks against retrieval-augmented generation systems by planting adversarial text in publicly accessible documents that the target system indexes. In enterprise deployments, attackers have shown that emails or shared documents containing hidden injection payloads can manipulate AI assistants that process corporate knowledge bases, potentially leading to data exfiltration or unauthorised actions.

Definition

How It Relates to AI Threats

Why It Occurs

Real-World Context

Related Threat Patterns

Related Terms