How-To Guide

How to Build an AI Incident Response Plan

A 5-phase AI incident response framework covering detection, containment, investigation, remediation, and regulatory reporting, including EU AI Act Article 73 obligations and AIID submission guidance.

Last updated: 2026-03-15

Who this is for: Risk officers, security teams, product owners, and compliance professionals responsible for AI systems that could cause harm. Particularly relevant for organizations operating AI under the EU AI Act, HIPAA, or other regulated contexts.

An AI incident response plan covers five phases: (1) detection and triage, (2) containment, (3) investigation, (4) remediation and recovery, and (5) reporting and notification. AI incidents differ from conventional software incidents in two critical ways: harm may be diffuse (affecting many people in small ways rather than one system catastrophically), and root causes often involve model behavior rather than code defects—requiring different investigation and remediation skills. This guide covers all five phases, regulatory reporting obligations, and how to submit incidents to external databases.

What Qualifies as an AI Incident

An AI incident is any event where an AI system causes, contributes to, or narrowly avoids harm. The topaithreats incident database classifies incidents across four failure stages:

Stage | Definition | Example
Signal | Early indicator that harm may occur; no harm yet | Model begins producing subtly biased outputs in testing
Near miss | Harmful output or action that was stopped before impact | Injected instruction caught by approval gate before email was sent
Harm | Harm reached one or more affected parties | AI hiring tool discriminated against protected class; deepfake fraud succeeded
Systemic risk | Widespread or structural harm affecting many parties or critical systems | AI model used in financial decisions produces systematic errors affecting thousands

Severity tiers for response prioritization:

Tier | Criteria | Response SLA
P1 — Critical | Ongoing harm to people, data exfiltration in progress, safety-critical system failure, or EU AI Act serious incident | Immediate response; executive escalation within 1 hour
P2 — High | Harm occurred but contained; material data exposure; regulatory notification likely required | Acknowledge within 4 hours; investigation within 24 hours
P3 — Medium | Limited harm; single-user impact; no regulatory notification required | Investigate within 72 hours
P4 — Low | Near miss; no harm occurred; used for learning | Log and review in next security cycle
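The tier criteria above can be encoded as a small triage helper. This is an illustrative sketch; the signal field names are assumptions for the example, not part of any standard.

```python
from dataclasses import dataclass

# Illustrative triage sketch mapping observed signals to the P1-P4 tiers
# defined in the severity table above. Field names are assumptions.
@dataclass
class IncidentSignals:
    ongoing_harm: bool = False              # harm to people still occurring
    exfiltration_in_progress: bool = False  # data actively leaving the system
    harm_occurred: bool = False             # harm happened but is contained
    notification_likely: bool = False       # regulatory notification likely required

def triage_tier(s: IncidentSignals) -> str:
    if s.ongoing_harm or s.exfiltration_in_progress:
        return "P1"  # immediate response; executive escalation within 1 hour
    if s.harm_occurred and s.notification_likely:
        return "P2"  # acknowledge within 4 hours
    if s.harm_occurred:
        return "P3"  # investigate within 72 hours
    return "P4"      # near miss; log and review in next security cycle
```

In practice the signals would be populated during intake (Phase 1), and any ambiguity should be resolved upward to the more severe tier.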

Phase 1: Detection and Triage

AI incidents surface through multiple channels—monitoring alerts, user reports, internal discovery, third-party disclosure, or media reporting. Detection readiness requires:

Monitoring infrastructure: Automated monitoring for behavioral anomalies (injection attempt patterns, anomalous tool call sequences, output format deviations, cross-tenant data signals) should be in place before an incident occurs. See AI Security Best Practices for monitoring signal guidance.

Intake channels: Users, employees, and security researchers need a documented way to report suspected AI incidents. Publish a responsible disclosure contact for external reporters. Maintain an internal incident intake form with fields for: suspected incident type, AI system involved, observed behavior, affected parties, and time of observation.

Triage criteria: On receiving a report, triage assigns a severity tier (P1–P4) and an incident type:

  • Security incident (prompt injection, data exfiltration, unauthorized access)
  • Safety incident (harmful output, discriminatory decision, dangerous recommendation)
  • Privacy incident (PII exposure, cross-tenant data leakage)
  • Reliability incident (significant model failure, erroneous output at scale)

Incident owner assignment: Every incident requires a named owner responsible for driving it to resolution. The incident owner is distinct from the engineer who investigates—the owner coordinates, communicates, and ensures the incident is not dropped.
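An intake record tying the triage classification to a named owner might look like the following sketch. The field names mirror the intake form described above; nothing here is a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class IncidentRecord:
    system: str             # AI system involved
    incident_type: str      # "security" | "safety" | "privacy" | "reliability"
    tier: str               # P1-P4 from triage
    observed_behavior: str
    owner: str = ""         # named owner who drives the incident to resolution
    opened_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def assign_owner(self, name: str) -> None:
        # The owner coordinates and communicates; the investigating
        # engineer is assigned separately.
        self.owner = name

incident = IncidentRecord(
    system="support-chat-agent",
    incident_type="security",
    tier="P2",
    observed_behavior="Injected instruction triggered an unexpected email send",
)
incident.assign_owner("on-call incident manager")
```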

Phase 2: Containment

Containment limits ongoing harm while investigation proceeds. AI-specific containment actions:

Isolate or throttle the affected system: For P1/P2 incidents, consider disabling the affected AI feature, routing traffic away from the affected model endpoint, or switching to a fallback configuration while the incident is investigated. The ability to roll back to a previous model version or system prompt is a containment prerequisite—establish rollback capability before deployment, not during an incident.
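Rollback as a containment action can be as simple as a configuration switch. The configuration shape below is a hypothetical sketch, not a real deployment API.

```python
# Hypothetical active and known-good configurations for the affected feature.
ACTIVE_CONFIG = {
    "model_version": "2026-03-01",
    "system_prompt_id": "sp-current",
    "feature_enabled": True,
}
FALLBACK_CONFIG = {
    "model_version": "2026-01-15",   # last known-good model version
    "system_prompt_id": "sp-baseline",
    "feature_enabled": True,
}

def contain(active: dict, fallback: dict, disable_only: bool = False) -> dict:
    """Return the configuration to serve while the incident is investigated."""
    if disable_only:
        # Hard off-switch: keep the current config but disable the feature.
        return {**active, "feature_enabled": False}
    # Otherwise roll back to the previously validated configuration.
    return dict(fallback)
```

The point of the sketch is the precondition: `FALLBACK_CONFIG` must exist and be validated before deployment, or there is nothing to switch to during an incident.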

Preserve evidence: Before making any changes to the affected system, preserve:

  • Model version and configuration at time of incident
  • System prompt (if prompt change is a likely cause)
  • Relevant input/output logs surrounding the incident window
  • Tool call logs if the incident involves agentic behavior
  • Any retrieved content (RAG documents, tool outputs) that was processed during the incident

Evidence preservation is required for investigation and may be required for regulatory reporting. Do not modify or delete logs until the investigation is complete.
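One way to make preservation tamper-evident is to snapshot the artifacts listed above and record a content hash. This is a minimal sketch; the bundle fields are assumed from the evidence list.

```python
import hashlib
import json
from datetime import datetime, timezone

def preserve_evidence(model_version: str, system_prompt: str,
                      io_logs: list, tool_calls: list) -> dict:
    """Snapshot incident artifacts and attach a SHA-256 over the contents."""
    bundle = {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "system_prompt": system_prompt,
        "io_logs": io_logs,
        "tool_calls": tool_calls,
    }
    payload = json.dumps(bundle, sort_keys=True).encode()
    # The hash makes later edits to the bundle detectable.
    bundle["sha256"] = hashlib.sha256(payload).hexdigest()
    return bundle
```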

Limit harm propagation: If the incident involves data exposure, notify affected users or tenants promptly to enable them to take protective action. If the incident involves ongoing harmful outputs, add temporary content filters or rate limiting on the affected output path.

Notify stakeholders: Escalate P1/P2 incidents to executive leadership and legal/compliance teams immediately. Do not wait for investigation to be complete before internal notification.

Phase 3: Investigation

AI incident investigation requires different skills than conventional security incident response. Root causes may involve model behavior, training data, system prompt design, retrieval pipeline configuration, or the interaction between these layers.

Root cause analysis: Map the incident to its causal factors using topaithreats’ causal factor taxonomy. Common root causes include prompt injection, misconfigured deployment, insufficient safety testing, training data bias, and excessive agent permissions; each maps to a remediation type in Phase 4.

Affected party identification: Determine who was harmed and to what extent. AI harms are often diffuse—a biased model may have affected many individuals in small ways without any single person knowing. Scope assessment requires examining decision logs, not only complaint records.

Timeline reconstruction: Build a chronological timeline from logs: when the vulnerability was introduced, when it became exploitable, when exploitation first occurred (if applicable), and when it was detected. This timeline is required for regulatory reporting and for preventing recurrence.
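Reconstructing the timeline is largely a matter of normalizing timestamps from different systems and sorting. The events below are invented for illustration.

```python
from datetime import datetime

# Hypothetical log events gathered from different systems, out of order.
events = [
    {"ts": "2026-03-10T14:02:00", "event": "detected: anomalous tool calls flagged"},
    {"ts": "2026-03-08T09:30:00", "event": "introduced: risky prompt change deployed"},
    {"ts": "2026-03-09T11:15:00", "event": "exploited: injected instruction executed"},
]

# Sort chronologically to rebuild the incident timeline.
timeline = sorted(events, key=lambda e: datetime.fromisoformat(e["ts"]))

# The exposure window runs from introduction to detection.
exposure_window = (timeline[0]["ts"], timeline[-1]["ts"])
```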

Reproduction: Attempt to reproduce the incident in a controlled environment to confirm the root cause hypothesis. Document the reproduction steps—these feed the findings register and inform remediation testing.

AI-Specific Investigation Playbooks

Each incident type requires a different investigation focus. Use the appropriate playbook based on the triage classification from Phase 1:

Incident Type | Primary Investigation Steps | Key Evidence | Specialist Required
Prompt injection | 1. Identify injection vector (direct/indirect/cross-tool) 2. Determine if system prompt was extracted 3. Assess data exfiltration scope 4. Check if injected instructions persisted in memory/RAG | Input/output logs, tool call sequence, RAG retrieval logs | AppSec engineer
Data exfiltration | 1. Identify what data was accessed 2. Trace the exfiltration path (tool calls, URLs, email sends) 3. Determine if exfiltration was injection-driven or misconfiguration-driven 4. Scope affected users/tenants | Agent tool call audit logs, network logs, tenant access logs | Security + ML Ops
Harmful/biased output | 1. Collect affected outputs with demographic metadata 2. Run disparate impact analysis on historical outputs 3. Identify whether root cause is training data, system prompt, or retrieval content 4. Assess scope (how many decisions affected) | Output logs with demographic data, model version, training data provenance | Responsible AI / ML engineer
Hallucination causing harm | 1. Identify the specific fabricated claim 2. Determine whether RAG grounding was active and what was retrieved 3. Check if the hallucination is reproducible or stochastic 4. Assess downstream actions taken on the hallucinated output | RAG retrieval logs, model output logs, downstream action records | ML engineer
Agent autonomy failure | 1. Reconstruct the full tool call sequence 2. Identify where the agent exceeded its intended scope 3. Check if human approval gates were bypassed or absent 4. Determine if the failure was injection-driven or goal drift | Tool call audit log with timestamps, permission grants, approval gate logs | ML engineer + Platform
Privacy/PII exposure | 1. Identify what PII was exposed and to whom 2. Determine whether PII came from training data memorization, RAG retrieval, or cross-tenant leakage 3. Assess regulatory notification obligations 4. Scope: how many individuals’ data was exposed | Output logs, retrieval logs, tenant boundary audit | Privacy + Legal
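Routing a triaged incident to the right playbook can be a simple lookup. The entries below abbreviate the matrix above; the dictionary shape is illustrative, not a required format.

```python
# Abbreviated playbook registry keyed by the Phase 1 incident type.
PLAYBOOKS = {
    "prompt_injection": {
        "specialist": "AppSec engineer",
        "key_evidence": ["input/output logs", "tool call sequence",
                         "RAG retrieval logs"],
    },
    "data_exfiltration": {
        "specialist": "Security + ML Ops",
        "key_evidence": ["agent tool call audit logs", "network logs",
                         "tenant access logs"],
    },
    "harmful_biased_output": {
        "specialist": "Responsible AI / ML engineer",
        "key_evidence": ["output logs with demographic data", "model version",
                         "training data provenance"],
    },
}

def playbook_for(incident_type: str) -> dict:
    """Look up the investigation playbook for a triaged incident type."""
    if incident_type not in PLAYBOOKS:
        raise ValueError(f"no playbook defined for: {incident_type}")
    return PLAYBOOKS[incident_type]
```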

For each playbook, the investigation should produce: confirmed root cause, scope of impact, timeline of exposure, and a remediation recommendation that feeds directly into Phase 4.

Phase 4: Remediation and Recovery

Remediation addresses the root cause identified in Phase 3. Common remediation types for AI incidents:

Root Cause | Remediation
Prompt injection | Privilege separation architecture, input validation, output filtering
Misconfigured deployment | Configuration audit, default settings hardening, permission review
Insufficient safety testing | Expanded red team coverage, additional test cases for the failed category
Training data bias | Dataset audit, retraining with balanced data, output monitoring for affected decision types
Excessive agent permissions | Reduce tool access scope, implement action allowlist, add human approval gates

Remediation verification: Re-test the specific failure scenario after remediation to confirm closure. For P1/P2 findings, also run a broader regression test to ensure remediation did not introduce new failures.
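Verification can be automated as a closure check over the recorded reproduction case plus a regression suite. `run_case` below is a stand-in for whatever harness reproduced the incident in Phase 3; everything here is a sketch under that assumption.

```python
def run_case(case: dict) -> bool:
    # Stand-in for the real test harness: returns True when the system
    # behaves safely on this case. Replace with actual execution logic.
    return case.get("behaves_safely", False)

def verify_remediation(repro_case: dict, regression_suite: list) -> dict:
    """Confirm the original failure is closed and nothing else regressed."""
    failed = [c["name"] for c in regression_suite if not run_case(c)]
    repro_closed = run_case(repro_case)
    return {
        "repro_closed": repro_closed,   # original failure no longer reproduces
        "regressions": failed,          # cases broken by the remediation
        "passed": repro_closed and not failed,
    }
```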

Residual risk acceptance: For incidents where full remediation is not feasible before re-enabling the affected system, document the residual risk explicitly. Residual risk requires sign-off from a product owner or risk committee—not the engineering team alone. The residual risk record feeds the organization’s AI risk register.

System re-enablement: Before re-enabling a disabled system, complete: root cause remediated (or residual risk accepted), remediation verified by re-test, affected parties notified (if required), and regulatory reporting completed (if required).
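The re-enablement conditions above reduce to a simple gate; a minimal sketch:

```python
def may_reenable(remediated: bool, residual_risk_accepted: bool,
                 retest_passed: bool, parties_notified: bool,
                 reporting_done: bool) -> bool:
    """True only when every re-enablement condition is satisfied."""
    # Root cause must be remediated, or the residual risk formally accepted.
    root_cause_addressed = remediated or residual_risk_accepted
    return (root_cause_addressed and retest_passed
            and parties_notified and reporting_done)
```

Encoding the gate as code (for example, in a deployment pipeline check) prevents a disabled system from being quietly re-enabled before notification and reporting are complete.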

Phase 5: Reporting and Notification

AI incident reporting obligations vary by jurisdiction, industry, and incident severity. This phase has hard deadlines in some contexts.

EU AI Act — Serious Incident Reporting (Article 73): For high-risk AI systems under the EU AI Act, providers must report serious incidents (those that result in death, serious harm to health, significant disruption to critical infrastructure, or violation of fundamental rights) to the relevant national competent authority. The reporting obligation is triggered when the provider becomes aware of the incident and establishes a causal link (or its reasonable likelihood); Article 73 sets an outer deadline of 15 days from awareness, with shorter deadlines for deaths and for widespread or critical-infrastructure incidents. Consult legal counsel for the specific notification timelines applicable to your jurisdiction.

GDPR/NIS2 data breach overlap: If an AI incident involves a personal data breach, GDPR Article 33 requires notification to the supervisory authority within 72 hours. If the incident affects network and information systems in scope of NIS2, NIS2 reporting obligations apply independently of the AI Act.
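The 72-hour GDPR clock is easy to get wrong under pressure; a small helper makes the deadline explicit. The awareness timestamp below is invented for the example.

```python
from datetime import datetime, timedelta, timezone

def gdpr_notification_deadline(aware_at: datetime) -> datetime:
    """GDPR Article 33: notify the supervisory authority within 72 hours
    of becoming aware of a personal data breach."""
    return aware_at + timedelta(hours=72)

aware = datetime(2026, 3, 10, 14, 0, tzinfo=timezone.utc)
deadline = gdpr_notification_deadline(aware)
```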

Internal reporting: Every incident, regardless of severity tier, requires a written incident report. The report documents: incident timeline, root cause, affected parties, containment actions, remediation applied, residual risk (if any), and lessons learned. This report updates the AI risk register and informs model card updates.

Reporting to external databases:

AIID (AI Incident Database): Submit incidents to the AI Incident Database at incidentdatabase.ai. The AIID accepts reports of AI-related harms from any submitter. Submission requires: a description of the incident, the AI system involved, the date, affected parties, and links to source documentation.

topaithreats: Submit incidents at the contributing page with supporting evidence, source links, and causal factor mapping.

Post-incident review: Schedule a post-incident review 1–2 weeks after resolution, while details are fresh. Review agenda: what happened, why detection took as long as it did, whether response procedures worked as intended, and what process or technical changes would prevent recurrence. Document outcomes and assign follow-up actions with owners and deadlines.

RACI: Roles and Responsibilities

Activity | Responsible | Accountable | Consulted | Informed
Incident detection and intake | Security / ML Ops | Incident Owner | Engineering | Leadership
Severity triage | Incident Owner | CISO / Risk Officer | Legal | Affected team leads
Containment | Engineering | Incident Owner | Security | Leadership (P1/P2)
Investigation | Engineering / ML | Incident Owner | Legal, Privacy | Risk Officer
Regulatory notification | Legal / Compliance | CISO | Incident Owner | Executive
External database submission | Security / Trust & Safety | Risk Officer | Legal |
Post-incident review | Incident Owner | Risk Officer | All involved | Leadership

Incident Response Readiness Checklist

Before an incident occurs, verify: