How to Build an AI Incident Response Plan
A 5-phase AI incident response framework covering detection, containment, investigation, remediation, and regulatory reporting—including EU AI Act Article 73 serious-incident obligations and AIID submission guidance.
Last updated: 2026-03-15
Who this is for: Risk officers, security teams, product owners, and compliance professionals responsible for AI systems that could cause harm. Particularly relevant for organizations operating AI under the EU AI Act, HIPAA, or other regulated contexts.
An AI incident response plan covers five phases: (1) detection and triage, (2) containment, (3) investigation, (4) remediation and recovery, and (5) reporting and notification. AI incidents differ from conventional software incidents in two critical ways: harm may be diffuse (affecting many people in small ways rather than one system catastrophically), and root causes often involve model behavior rather than code defects—requiring different investigation and remediation skills. This guide covers all five phases, regulatory reporting obligations, and how to submit incidents to external databases.
What Qualifies as an AI Incident
An AI incident is any event where an AI system causes, contributes to, or narrowly avoids harm. The topaithreats incident database classifies incidents across four failure stages:
| Stage | Definition | Example |
|---|---|---|
| Signal | Early indicator that harm may occur; no harm yet | Model begins producing subtly biased outputs in testing |
| Near miss | Harmful output or action that was stopped before impact | Injected instruction caught by approval gate before email was sent |
| Harm | Harm reached one or more affected parties | AI hiring tool discriminated against protected class; deepfake fraud succeeded |
| Systemic risk | Widespread or structural harm affecting many parties or critical systems | AI model used in financial decisions produces systematic errors affecting thousands |
Severity tiers for response prioritization:
| Tier | Criteria | Response SLA |
|---|---|---|
| P1 — Critical | Ongoing harm to people, data exfiltration in progress, safety-critical system failure, or EU AI Act serious incident | Immediate response; executive escalation within 1 hour |
| P2 — High | Harm occurred but contained; material data exposure; regulatory notification likely required | Acknowledge within 4 hours; investigation within 24 hours |
| P3 — Medium | Limited harm; single-user impact; no regulatory notification required | Investigate within 72 hours |
| P4 — Low | Near miss; no harm occurred; used for learning | Log and review in next security cycle |
Phase 1: Detection and Triage
AI incidents surface through multiple channels—monitoring alerts, user reports, internal discovery, third-party disclosure, or media reporting. Detection readiness requires:
Monitoring infrastructure: Automated monitoring for behavioral anomalies (injection attempt patterns, anomalous tool call sequences, output format deviations, cross-tenant data signals) should be in place before an incident occurs. See AI Security Best Practices for monitoring signal guidance.
Intake channels: Users, employees, and security researchers need a documented way to report suspected AI incidents. Publish a responsible disclosure contact for external reporters. Maintain an internal incident intake form with fields for: suspected incident type, AI system involved, observed behavior, affected parties, and time of observation.
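The intake form fields listed above can be captured as a structured record so triage works from consistent data. This is a minimal sketch; the class and field names are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class IncidentType(Enum):
    SECURITY = "security"        # prompt injection, exfiltration, unauthorized access
    SAFETY = "safety"            # harmful output, discriminatory decision
    PRIVACY = "privacy"          # PII exposure, cross-tenant leakage
    RELIABILITY = "reliability"  # model failure, erroneous output at scale


@dataclass
class IncidentReport:
    """Internal intake record mirroring the fields listed above."""
    suspected_type: IncidentType
    ai_system: str               # which AI system is involved
    observed_behavior: str       # free-text description of what was seen
    affected_parties: list[str]  # users, tenants, or groups believed affected
    observed_at: datetime        # time of observation (UTC)
    reporter: str                # employee, user, or external discloser
    received_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
```

Keeping `received_at` machine-set rather than reporter-supplied gives triage a reliable clock for the response SLAs in the severity table.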
Triage criteria: On receiving a report, triage assigns a severity tier (P1–P4) and an incident type:
- Security incident (prompt injection, data exfiltration, unauthorized access)
- Safety incident (harmful output, discriminatory decision, dangerous recommendation)
- Privacy incident (PII exposure, cross-tenant data leakage)
- Reliability incident (significant model failure, erroneous output at scale)
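The tier assignment from the severity table can be sketched as a small decision function. The boolean flags are simplified stand-ins for the triage questions; real triage would also weigh EU AI Act serious-incident criteria and data-exposure scope:

```python
def triage_severity(ongoing_harm: bool,
                    harm_occurred: bool,
                    regulatory_notification_likely: bool,
                    single_user_only: bool) -> str:
    """Assign a P1-P4 tier per the severity-tier table above."""
    if ongoing_harm:
        return "P1"  # immediate response; executive escalation within 1 hour
    if harm_occurred and regulatory_notification_likely:
        return "P2"  # acknowledge within 4 hours; investigate within 24 hours
    if harm_occurred:
        return "P3"  # limited harm, e.g. single-user impact; 72-hour investigation
    return "P4"      # near miss: log and review in next security cycle
```

The ordering matters: ongoing harm dominates every other signal, which is why the checks run from P1 down.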
Incident owner assignment: Every incident requires a named owner responsible for driving it to resolution. The incident owner is distinct from the engineer who investigates—the owner coordinates, communicates, and ensures the incident is not dropped.
Phase 2: Containment
Containment limits ongoing harm while investigation proceeds. AI-specific containment actions:
Isolate or throttle the affected system: For P1/P2 incidents, consider disabling the affected AI feature, routing traffic away from the affected model endpoint, or switching to a fallback configuration while the incident is investigated. The ability to roll back to a previous model version or system prompt is a containment prerequisite—establish rollback capability before deployment, not during an incident.
Preserve evidence: Before making any changes to the affected system, preserve:
- Model version and configuration at time of incident
- System prompt (if prompt change is a likely cause)
- Relevant input/output logs surrounding the incident window
- Tool call logs if the incident involves agentic behavior
- Any retrieved content (RAG documents, tool outputs) that was processed during the incident
Evidence preservation is required for investigation and may be required for regulatory reporting. Do not modify or delete logs until the investigation is complete.
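The preservation steps above can be automated as a snapshot script that copies each artifact into a write-once evidence folder and records SHA-256 hashes, so later tampering or accidental modification is detectable. A sketch under assumed paths and layout:

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def preserve_evidence(incident_id: str, artifacts: dict[str, Path],
                      vault_root: Path) -> Path:
    """Copy artifacts (model config, system prompt, log extracts) into an
    evidence bundle and write a manifest with SHA-256 digests."""
    bundle = vault_root / incident_id
    bundle.mkdir(parents=True, exist_ok=False)  # fail rather than overwrite
    manifest = {
        "incident_id": incident_id,
        "preserved_at": datetime.now(timezone.utc).isoformat(),
        "artifacts": {},
    }
    for name, src in artifacts.items():
        dest = bundle / src.name
        shutil.copy2(src, dest)  # copy2 preserves file timestamps
        digest = hashlib.sha256(dest.read_bytes()).hexdigest()
        manifest["artifacts"][name] = {"file": dest.name, "sha256": digest}
    (bundle / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return bundle
```

The `exist_ok=False` guard is deliberate: an evidence bundle should never be silently overwritten once created.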
Limit harm propagation: If the incident involves data exposure, notify affected users or tenants promptly to enable them to take protective action. If the incident involves ongoing harmful outputs, add temporary content filters or rate limiting on the affected output path.
Notify stakeholders: Escalate P1/P2 incidents to executive leadership and legal/compliance teams immediately. Do not wait for investigation to be complete before internal notification.
Phase 3: Investigation
AI incident investigation requires different skills than conventional security incident response. Root causes may involve model behavior, training data, system prompt design, retrieval pipeline configuration, or the interaction between these layers.
Root cause analysis: Map the incident to its causal factors using topaithreats’ causal factor taxonomy. Common root causes:
- Prompt injection vulnerability — system accepted adversarial input as instructions
- Misconfigured deployment — default settings, excessive permissions, or missing controls
- Insufficient safety testing — failure mode existed but was not discovered pre-deployment
- Inadequate access controls — model or agent accessed data it should not have reached
- Training data bias — systematic error caused by biased training data
Affected party identification: Determine who was harmed and to what extent. AI harms are often diffuse—a biased model may have affected many individuals in small ways without any single person knowing. Scope assessment requires examining decision logs, not only complaint records.
Timeline reconstruction: Build a chronological timeline from logs: when the vulnerability was introduced, when it became exploitable, when exploitation first occurred (if applicable), and when it was detected. This timeline is required for regulatory reporting and for preventing recurrence.
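Timeline reconstruction is mostly a merge-and-sort over heterogeneous log sources (model logs, tool call logs, alert logs). A minimal sketch, assuming each event carries an ISO-8601 `ts` field and an `event` description:

```python
from datetime import datetime
from typing import Iterable


def reconstruct_timeline(*sources: Iterable[dict]) -> list[dict]:
    """Merge events from several log sources into one chronological
    timeline, ordered by their 'ts' timestamp."""
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda e: datetime.fromisoformat(e["ts"]))
```

The milestones the text calls out (vulnerability introduced, became exploitable, first exploitation, detection) then read off the sorted list directly, which also yields the exposure window regulators ask about.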
Reproduction: Attempt to reproduce the incident in a controlled environment to confirm the root cause hypothesis. Document the reproduction steps—these feed the findings register and inform remediation testing.
AI-Specific Investigation Playbooks
Each incident type requires a different investigation focus. Use the appropriate playbook based on the triage classification from Phase 1:
| Incident Type | Primary Investigation Steps | Key Evidence | Specialist Required |
|---|---|---|---|
| Prompt injection | 1. Identify injection vector (direct/indirect/cross-tool) 2. Determine if system prompt was extracted 3. Assess data exfiltration scope 4. Check if injected instructions persisted in memory/RAG | Input/output logs, tool call sequence, RAG retrieval logs | AppSec engineer |
| Data exfiltration | 1. Identify what data was accessed 2. Trace the exfiltration path (tool calls, URLs, email sends) 3. Determine if exfiltration was injection-driven or misconfiguration-driven 4. Scope affected users/tenants | Agent tool call audit logs, network logs, tenant access logs | Security + ML Ops |
| Harmful/biased output | 1. Collect affected outputs with demographic metadata 2. Run disparate impact analysis on historical outputs 3. Identify whether root cause is training data, system prompt, or retrieval content 4. Assess scope (how many decisions affected) | Output logs with demographic data, model version, training data provenance | Responsible AI / ML engineer |
| Hallucination causing harm | 1. Identify the specific fabricated claim 2. Determine whether RAG grounding was active and what was retrieved 3. Check if the hallucination is reproducible or stochastic 4. Assess downstream actions taken on the hallucinated output | RAG retrieval logs, model output logs, downstream action records | ML engineer |
| Agent autonomy failure | 1. Reconstruct the full tool call sequence 2. Identify where the agent exceeded its intended scope 3. Check if human approval gates were bypassed or absent 4. Determine if the failure was injection-driven or goal-drift | Tool call audit log with timestamps, permission grants, approval gate logs | ML engineer + Platform |
| Privacy/PII exposure | 1. Identify what PII was exposed and to whom 2. Determine whether PII came from training data memorization, RAG retrieval, or cross-tenant leakage 3. Assess regulatory notification obligations 4. Scope: how many individuals’ data was exposed | Output logs, retrieval logs, tenant boundary audit | Privacy + Legal |
For each playbook, the investigation should produce: confirmed root cause, scope of impact, timeline of exposure, and a remediation recommendation that feeds directly into Phase 4.
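For the harmful/biased output playbook, step 2's disparate impact analysis is commonly screened with the four-fifths rule: a group's selection rate below 80% of the most-favored group's rate is flagged for review. A minimal sketch (group names and counts are illustrative, and the ratio is a screening signal, not a legal finding):

```python
def disparate_impact_ratios(selected: dict[str, int],
                            total: dict[str, int],
                            reference_group: str) -> dict[str, float]:
    """Selection-rate ratio of each group relative to the reference group.

    A ratio below 0.8 is the conventional four-fifths-rule flag for
    potential adverse impact.
    """
    rates = {group: selected[group] / total[group] for group in total}
    reference_rate = rates[reference_group]
    return {group: rate / reference_rate for group, rate in rates.items()}
```

Running this over the full historical decision log, not just complained-about cases, is what makes the diffuse-harm scoping in the affected-party step tractable.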
Phase 4: Remediation and Recovery
Remediation addresses the root cause identified in Phase 3. Common remediation types for AI incidents:
| Root Cause | Remediation |
|---|---|
| Prompt injection | Privilege separation architecture, input validation, output filtering |
| Misconfigured deployment | Configuration audit, default settings hardening, permission review |
| Insufficient safety testing | Expanded red team coverage, additional test cases for the failed category |
| Training data bias | Dataset audit, retraining with balanced data, output monitoring for affected decision types |
| Excessive agent permissions | Reduce tool access scope, implement action allowlist, add human approval gates |
Remediation verification: Re-test the specific failure scenario after remediation to confirm closure. For P1/P2 findings, also run a broader regression test to ensure remediation did not introduce new failures.
Residual risk acceptance: For incidents where full remediation is not feasible before re-enabling the affected system, document the residual risk explicitly. Residual risk requires sign-off from a product owner or risk committee—not the engineering team alone. The residual risk record feeds the organization’s AI risk register.
System re-enablement: Before re-enabling a disabled system, complete: root cause remediated (or residual risk accepted), remediation verified by re-test, affected parties notified (if required), and regulatory reporting completed (if required).
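The re-enablement conditions above can be enforced as an explicit gate rather than a judgment call. A sketch (field names are illustrative; the notification and reporting flags are set to true when the step is done or confirmed not required):

```python
from dataclasses import dataclass


@dataclass
class ReEnablementGate:
    """All conditions from the re-enablement checklist must hold;
    residual-risk acceptance substitutes for full remediation."""
    root_cause_remediated: bool
    residual_risk_accepted: bool
    remediation_verified: bool
    affected_parties_notified: bool   # done, or not required
    regulatory_reporting_done: bool   # done, or not required

    def may_re_enable(self) -> bool:
        return ((self.root_cause_remediated or self.residual_risk_accepted)
                and self.remediation_verified
                and self.affected_parties_notified
                and self.regulatory_reporting_done)
```

Encoding the gate keeps the residual-risk path honest: skipping remediation still requires the explicit sign-off flag, never a silent bypass.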
Phase 5: Reporting and Notification
AI incident reporting obligations vary by jurisdiction, industry, and incident severity. This phase has hard deadlines in some contexts.
EU AI Act — Serious Incident Reporting (Article 73): For high-risk AI systems under the EU AI Act, providers must report serious incidents (those that result in death, serious harm to health, serious and irreversible disruption to critical infrastructure, or infringement of obligations protecting fundamental rights) to the market surveillance authority of the Member State where the incident occurred. (The equivalent provision was Article 62 in the Commission's 2021 draft.) The obligation is triggered once the provider establishes a causal link, or the reasonable likelihood of one, between the AI system and the incident: no later than 15 days after becoming aware in the general case, 10 days in the event of death, and 2 days for a widespread infringement or serious disruption of critical infrastructure. Consult legal counsel to confirm the deadlines and competent authority applicable to your case.
GDPR/NIS2 data breach overlap: If an AI incident involves a personal data breach, GDPR Article 33 requires notification to the supervisory authority within 72 hours. If the incident affects network and information systems in scope of NIS2, NIS2 reporting obligations apply independently of the AI Act.
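The GDPR clock is worth computing explicitly during a live incident, since 72 hours from awareness rarely aligns with business days. A trivial helper:

```python
from datetime import datetime, timedelta

def gdpr_notification_deadline(became_aware: datetime) -> datetime:
    """GDPR Article 33: notification to the supervisory authority is due
    within 72 hours of the controller becoming aware of the breach."""
    return became_aware + timedelta(hours=72)
```

Note the clock runs from awareness of the breach, not from the end of the investigation; a notification can be filed in phases if facts are still emerging.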
Internal reporting: Every incident, regardless of severity tier, requires a written incident report. The report documents: incident timeline, root cause, affected parties, containment actions, remediation applied, residual risk (if any), and lessons learned. This report updates the AI risk register and informs model card updates.
Reporting to external databases:
AIID (AI Incident Database): Submit incidents to the AI Incident Database at incidentdatabase.ai. The AIID accepts reports of AI-related harms from any submitter. Submission requires: a description of the incident, the AI system involved, the date, affected parties, and links to source documentation.
topaithreats: Submit incidents at the contributing page with supporting evidence, source links, and causal factor mapping.
Post-incident review: Schedule a post-incident review 1–2 weeks after resolution while details are fresh. Review agenda: what happened, why detection took the time it did, whether response procedures worked, and what process or technical changes prevent recurrence. Document outcomes and assign follow-up actions with owners and deadlines.
RACI: Roles and Responsibilities
| Activity | Responsible | Accountable | Consulted | Informed |
|---|---|---|---|---|
| Incident detection and intake | Security / ML Ops | Incident Owner | Engineering | Leadership |
| Severity triage | Incident Owner | CISO / Risk Officer | Legal | Affected team leads |
| Containment | Engineering | Incident Owner | Security | Leadership (P1/P2) |
| Investigation | Engineering / ML | Incident Owner | Legal, Privacy | Risk Officer |
| Regulatory notification | Legal / Compliance | CISO | Incident Owner | Executive |
| External database submission | Security / Trust & Safety | Risk Officer | Legal | — |
| Post-incident review | Incident Owner | Risk Officer | All involved | Leadership |
Incident Response Readiness Checklist
Before an incident occurs, verify:
- Behavioral monitoring and alerting are in place for every deployed AI system (Phase 1)
- An internal incident intake form exists and a responsible disclosure contact is published (Phase 1)
- Rollback to a previous model version or system prompt is established and tested (Phase 2)
- Log retention covers model inputs/outputs, tool calls, and retrieved content long enough to support evidence preservation (Phase 2)
- Investigation playbooks are mapped to the specialists who will run them (Phase 3)
- The AI risk register exists and can receive residual risk records (Phase 4)
- Regulatory notification contacts and deadlines (EU AI Act, GDPR, NIS2) are documented with legal/compliance (Phase 5)
- RACI roles are assigned to named individuals, including incident owner candidates per system
Related Resources
- AI incidents database — documented AI harm cases on topaithreats
- AI Deployment Checklist — prevention controls that reduce incident likelihood
- AI Red Teaming — proactive discovery of vulnerabilities before incidents occur
- EU AI Act framework — regulatory obligations for high-risk AI systems
- NIST AI Risk Management Framework — Respond and Govern functions