How-To Guide

How to Build an AI Incident Response Plan

A 5-phase AI incident response framework covering detection, containment, investigation, remediation, and regulatory reporting, including EU AI Act Article 73 obligations and AIID submission guidance.

Last updated: 2026-03-15

Who this is for: Risk officers, security teams, product owners, and compliance professionals responsible for AI systems that could cause harm. Particularly relevant for organizations operating AI under the EU AI Act, HIPAA, or other regulated contexts.

An AI incident response plan covers five phases: (1) detection and triage, (2) containment, (3) investigation, (4) remediation and recovery, and (5) reporting and notification. AI incidents differ from conventional software incidents in two critical ways: harm may be diffuse (affecting many people in small ways rather than one system catastrophically), and root causes often involve model behavior rather than code defects—requiring different investigation and remediation skills. This guide covers all five phases, regulatory reporting obligations, and how to submit incidents to external databases.

What Qualifies as an AI Incident

An AI incident is any event where an AI system causes, contributes to, or narrowly avoids harm. The topaithreats incident database classifies incidents across four failure stages:

Stage | Definition | Example
Signal | Early indicator that harm may occur; no harm yet | Model begins producing subtly biased outputs in testing
Near miss | Harmful output or action that was stopped before impact | Injected instruction caught by approval gate before email was sent
Harm | Harm reached one or more affected parties | AI hiring tool discriminated against protected class; deepfake fraud succeeded
Systemic risk | Widespread or structural harm affecting many parties or critical systems | AI model used in financial decisions produces systematic errors affecting thousands

Severity tiers for response prioritization:

Tier | Criteria | Response SLA
P1 — Critical | Ongoing harm to people, data exfiltration in progress, safety-critical system failure, or EU AI Act serious incident | Immediate response; executive escalation within 1 hour
P2 — High | Harm occurred but contained; material data exposure; regulatory notification likely required | Acknowledge within 4 hours; investigation within 24 hours
P3 — Medium | Limited harm; single-user impact; no regulatory notification required | Investigate within 72 hours
P4 — Low | Near miss; no harm occurred; used for learning | Log and review in next security cycle
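The tier criteria above can be encoded as a small triage helper. This is an illustrative sketch; the signal field names are assumptions for the example, not part of any standard.

```python
from dataclasses import dataclass

# Illustrative triage sketch mapping observed signals to the P1-P4 tiers
# defined in the severity table above. Field names are assumptions.
@dataclass
class IncidentSignals:
    ongoing_harm: bool = False              # harm to people still occurring
    exfiltration_in_progress: bool = False  # data actively leaving the system
    harm_occurred: bool = False             # harm happened but is contained
    notification_likely: bool = False       # regulatory notification likely required

def triage_tier(s: IncidentSignals) -> str:
    if s.ongoing_harm or s.exfiltration_in_progress:
        return "P1"  # immediate response; executive escalation within 1 hour
    if s.harm_occurred and s.notification_likely:
        return "P2"  # acknowledge within 4 hours
    if s.harm_occurred:
        return "P3"  # investigate within 72 hours
    return "P4"      # near miss; log and review in next security cycle
```

In practice the signals would be populated during intake (Phase 1), and any ambiguity should be resolved upward to the more severe tier.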

Phase 1: Detection and Triage

AI incidents surface through multiple channels—monitoring alerts, user reports, internal discovery, third-party disclosure, or media reporting. Detection readiness requires:

Monitoring infrastructure: Automated monitoring for behavioral anomalies (injection attempt patterns, anomalous tool call sequences, output format deviations, cross-tenant data signals) should be in place before an incident occurs. See AI Security Best Practices for monitoring signal guidance.

Intake channels: Users, employees, and security researchers need a documented way to report suspected AI incidents. Publish a responsible disclosure contact for external reporters. Maintain an internal incident intake form with fields for: suspected incident type, AI system involved, observed behavior, affected parties, and time of observation.

Triage criteria: On receiving a report, triage assigns a severity tier (P1–P4) and an incident type:

  • Security incident (prompt injection, data exfiltration, unauthorized access)
  • Safety incident (harmful output, discriminatory decision, dangerous recommendation)
  • Privacy incident (PII exposure, cross-tenant data leakage)
  • Reliability incident (significant model failure, erroneous output at scale)

Incident owner assignment: Every incident requires a named owner responsible for driving it to resolution. The incident owner is distinct from the engineer who investigates—the owner coordinates, communicates, and ensures the incident is not dropped.
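An intake record tying the triage classification to a named owner might look like the following sketch. The field names mirror the intake form described above; nothing here is a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class IncidentRecord:
    system: str             # AI system involved
    incident_type: str      # "security" | "safety" | "privacy" | "reliability"
    tier: str               # P1-P4 from triage
    observed_behavior: str
    owner: str = ""         # named owner who drives the incident to resolution
    opened_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def assign_owner(self, name: str) -> None:
        # The owner coordinates and communicates; the investigating
        # engineer is assigned separately.
        self.owner = name

incident = IncidentRecord(
    system="support-chat-agent",
    incident_type="security",
    tier="P2",
    observed_behavior="Injected instruction triggered an unexpected email send",
)
incident.assign_owner("on-call incident manager")
```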

Phase 2: Containment

Containment limits ongoing harm while investigation proceeds. AI-specific containment actions:

Isolate or throttle the affected system: For P1/P2 incidents, consider disabling the affected AI feature, routing traffic away from the affected model endpoint, or switching to a fallback configuration while the incident is investigated. The ability to roll back to a previous model version or system prompt is a containment prerequisite—establish rollback capability before deployment, not during an incident.
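Rollback as a containment action can be as simple as a configuration switch. The configuration shape below is a hypothetical sketch, not a real deployment API.

```python
# Hypothetical active and known-good configurations for the affected feature.
ACTIVE_CONFIG = {
    "model_version": "2026-03-01",
    "system_prompt_id": "sp-current",
    "feature_enabled": True,
}
FALLBACK_CONFIG = {
    "model_version": "2026-01-15",   # last known-good model version
    "system_prompt_id": "sp-baseline",
    "feature_enabled": True,
}

def contain(active: dict, fallback: dict, disable_only: bool = False) -> dict:
    """Return the configuration to serve while the incident is investigated."""
    if disable_only:
        # Hard off-switch: keep the current config but disable the feature.
        return {**active, "feature_enabled": False}
    # Otherwise roll back to the previously validated configuration.
    return dict(fallback)
```

The point of the sketch is the precondition: `FALLBACK_CONFIG` must exist and be validated before deployment, or there is nothing to switch to during an incident.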

Preserve evidence: Before making any changes to the affected system, preserve:

  • Model version and configuration at time of incident
  • System prompt (if prompt change is a likely cause)
  • Relevant input/output logs surrounding the incident window
  • Tool call logs if the incident involves agentic behavior
  • Any retrieved content (RAG documents, tool outputs) that was processed during the incident

Evidence preservation is required for investigation and may be required for regulatory reporting. Do not modify or delete logs until the investigation is complete.
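One way to make preservation tamper-evident is to snapshot the artifacts listed above and record a content hash. This is a minimal sketch; the bundle fields are assumed from the evidence list.

```python
import hashlib
import json
from datetime import datetime, timezone

def preserve_evidence(model_version: str, system_prompt: str,
                      io_logs: list, tool_calls: list) -> dict:
    """Snapshot incident artifacts and attach a SHA-256 over the contents."""
    bundle = {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "system_prompt": system_prompt,
        "io_logs": io_logs,
        "tool_calls": tool_calls,
    }
    payload = json.dumps(bundle, sort_keys=True).encode()
    # The hash makes later edits to the bundle detectable.
    bundle["sha256"] = hashlib.sha256(payload).hexdigest()
    return bundle
```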

Limit harm propagation: If the incident involves data exposure, notify affected users or tenants promptly to enable them to take protective action. If the incident involves ongoing harmful outputs, add temporary content filters or rate limiting on the affected output path.

Notify stakeholders: Escalate P1/P2 incidents to executive leadership and legal/compliance teams immediately. Do not wait for investigation to be complete before internal notification.

Phase 3: Investigation

AI incident investigation requires different skills than conventional security incident response. Root causes may involve model behavior, training data, system prompt design, retrieval pipeline configuration, or the interaction between these layers.

Root cause analysis: Map the incident to its causal factors using topaithreats’ causal factor taxonomy. Common root causes include prompt injection, misconfigured deployment, insufficient safety testing, training data bias, and excessive agent permissions; each maps to a remediation type in Phase 4.

Affected party identification: Determine who was harmed and to what extent. AI harms are often diffuse—a biased model may have affected many individuals in small ways without any single person knowing. Scope assessment requires examining decision logs, not only complaint records.

Timeline reconstruction: Build a chronological timeline from logs: when the vulnerability was introduced, when it became exploitable, when exploitation first occurred (if applicable), and when it was detected. This timeline is required for regulatory reporting and for preventing recurrence.
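Reconstructing the timeline is largely a matter of normalizing timestamps from different systems and sorting. The events below are invented for illustration.

```python
from datetime import datetime

# Hypothetical log events gathered from different systems, out of order.
events = [
    {"ts": "2026-03-10T14:02:00", "event": "detected: anomalous tool calls flagged"},
    {"ts": "2026-03-08T09:30:00", "event": "introduced: risky prompt change deployed"},
    {"ts": "2026-03-09T11:15:00", "event": "exploited: injected instruction executed"},
]

# Sort chronologically to rebuild the incident timeline.
timeline = sorted(events, key=lambda e: datetime.fromisoformat(e["ts"]))

# The exposure window runs from introduction to detection.
exposure_window = (timeline[0]["ts"], timeline[-1]["ts"])
```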

Reproduction: Attempt to reproduce the incident in a controlled environment to confirm the root cause hypothesis. Document the reproduction steps—these feed the findings register and inform remediation testing.

AI-Specific Investigation Playbooks

Each incident type requires a different investigation focus. Use the appropriate playbook based on the triage classification from Phase 1:

Incident Type | Primary Investigation Steps | Key Evidence | Specialist Required
Prompt injection | 1. Identify injection vector (direct/indirect/cross-tool) 2. Determine if system prompt was extracted 3. Assess data exfiltration scope 4. Check if injected instructions persisted in memory/RAG | Input/output logs, tool call sequence, RAG retrieval logs | AppSec engineer
Data exfiltration | 1. Identify what data was accessed 2. Trace the exfiltration path (tool calls, URLs, email sends) 3. Determine if exfiltration was injection-driven or misconfiguration-driven 4. Scope affected users/tenants | Agent tool call audit logs, network logs, tenant access logs | Security + ML Ops
Harmful/biased output | 1. Collect affected outputs with demographic metadata 2. Run disparate impact analysis on historical outputs 3. Identify whether root cause is training data, system prompt, or retrieval content 4. Assess scope (how many decisions affected) | Output logs with demographic data, model version, training data provenance | Responsible AI / ML engineer
Hallucination causing harm | 1. Identify the specific fabricated claim 2. Determine whether RAG grounding was active and what was retrieved 3. Check if the hallucination is reproducible or stochastic 4. Assess downstream actions taken on the hallucinated output | RAG retrieval logs, model output logs, downstream action records | ML engineer
Agent autonomy failure | 1. Reconstruct the full tool call sequence 2. Identify where the agent exceeded its intended scope 3. Check if human approval gates were bypassed or absent 4. Determine if the failure was injection-driven or goal drift | Tool call audit log with timestamps, permission grants, approval gate logs | ML engineer + Platform
Privacy/PII exposure | 1. Identify what PII was exposed and to whom 2. Determine whether PII came from training data memorization, RAG retrieval, or cross-tenant leakage 3. Assess regulatory notification obligations 4. Scope: how many individuals’ data was exposed | Output logs, retrieval logs, tenant boundary audit | Privacy + Legal
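Routing a triaged incident to the right playbook can be a simple lookup. The entries below abbreviate the matrix above; the dictionary shape is illustrative, not a required format.

```python
# Abbreviated playbook registry keyed by the Phase 1 incident type.
PLAYBOOKS = {
    "prompt_injection": {
        "specialist": "AppSec engineer",
        "key_evidence": ["input/output logs", "tool call sequence",
                         "RAG retrieval logs"],
    },
    "data_exfiltration": {
        "specialist": "Security + ML Ops",
        "key_evidence": ["agent tool call audit logs", "network logs",
                         "tenant access logs"],
    },
    "harmful_biased_output": {
        "specialist": "Responsible AI / ML engineer",
        "key_evidence": ["output logs with demographic data", "model version",
                         "training data provenance"],
    },
}

def playbook_for(incident_type: str) -> dict:
    """Look up the investigation playbook for a triaged incident type."""
    if incident_type not in PLAYBOOKS:
        raise ValueError(f"no playbook defined for: {incident_type}")
    return PLAYBOOKS[incident_type]
```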

For each playbook, the investigation should produce: confirmed root cause, scope of impact, timeline of exposure, and a remediation recommendation that feeds directly into Phase 4.

Phase 4: Remediation and Recovery

Remediation addresses the root cause identified in Phase 3. Common remediation types for AI incidents:

Root Cause | Remediation
Prompt injection | Privilege separation architecture, input validation, output filtering
Misconfigured deployment | Configuration audit, default settings hardening, permission review
Insufficient safety testing | Expanded red team coverage, additional test cases for the failed category
Training data bias | Dataset audit, retraining with balanced data, output monitoring for affected decision types
Excessive agent permissions | Reduce tool access scope, implement action allowlist, add human approval gates

Remediation verification: Re-test the specific failure scenario after remediation to confirm closure. For P1/P2 findings, also run a broader regression test to ensure remediation did not introduce new failures.
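Verification can be automated as a closure check over the recorded reproduction case plus a regression suite. `run_case` below is a stand-in for whatever harness reproduced the incident in Phase 3; everything here is a sketch under that assumption.

```python
def run_case(case: dict) -> bool:
    # Stand-in for the real test harness: returns True when the system
    # behaves safely on this case. Replace with actual execution logic.
    return case.get("behaves_safely", False)

def verify_remediation(repro_case: dict, regression_suite: list) -> dict:
    """Confirm the original failure is closed and nothing else regressed."""
    failed = [c["name"] for c in regression_suite if not run_case(c)]
    repro_closed = run_case(repro_case)
    return {
        "repro_closed": repro_closed,   # original failure no longer reproduces
        "regressions": failed,          # cases broken by the remediation
        "passed": repro_closed and not failed,
    }
```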

Residual risk acceptance: For incidents where full remediation is not feasible before re-enabling the affected system, document the residual risk explicitly. Residual risk requires sign-off from a product owner or risk committee—not the engineering team alone. The residual risk record feeds the organization’s AI risk register.

System re-enablement: Before re-enabling a disabled system, complete: root cause remediated (or residual risk accepted), remediation verified by re-test, affected parties notified (if required), and regulatory reporting completed (if required).
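The re-enablement conditions above reduce to a simple gate; a minimal sketch:

```python
def may_reenable(remediated: bool, residual_risk_accepted: bool,
                 retest_passed: bool, parties_notified: bool,
                 reporting_done: bool) -> bool:
    """True only when every re-enablement condition is satisfied."""
    # Root cause must be remediated, or the residual risk formally accepted.
    root_cause_addressed = remediated or residual_risk_accepted
    return (root_cause_addressed and retest_passed
            and parties_notified and reporting_done)
```

Encoding the gate as code (for example, in a deployment pipeline check) prevents a disabled system from being quietly re-enabled before notification and reporting are complete.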

Phase 5: Reporting and Notification

AI incident reporting obligations vary by jurisdiction, industry, and incident severity. This phase has hard deadlines in some contexts.

EU AI Act — Serious Incident Reporting (Article 73): For high-risk AI systems under the EU AI Act, providers must report serious incidents (those that result in death, serious harm to health, significant disruption to critical infrastructure, or violation of fundamental rights) to the relevant national competent authority. The reporting obligation is triggered when the provider becomes aware of the incident and establishes a causal link (or its reasonable likelihood); Article 73 sets an outer deadline of 15 days from awareness, with shorter deadlines for deaths and for widespread or critical-infrastructure incidents. Consult legal counsel for the specific notification timelines applicable to your jurisdiction.

GDPR/NIS2 data breach overlap: If an AI incident involves a personal data breach, GDPR Article 33 requires notification to the supervisory authority within 72 hours. If the incident affects network and information systems in scope of NIS2, NIS2 reporting obligations apply independently of the AI Act.
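The 72-hour GDPR clock is easy to get wrong under pressure; a small helper makes the deadline explicit. The awareness timestamp below is invented for the example.

```python
from datetime import datetime, timedelta, timezone

def gdpr_notification_deadline(aware_at: datetime) -> datetime:
    """GDPR Article 33: notify the supervisory authority within 72 hours
    of becoming aware of a personal data breach."""
    return aware_at + timedelta(hours=72)

aware = datetime(2026, 3, 10, 14, 0, tzinfo=timezone.utc)
deadline = gdpr_notification_deadline(aware)
```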

Internal reporting: Every incident, regardless of severity tier, requires a written incident report. The report documents: incident timeline, root cause, affected parties, containment actions, remediation applied, residual risk (if any), and lessons learned. This report updates the AI risk register and informs model card updates.

Reporting to external databases:

AIID (AI Incident Database): Submit incidents to the AI Incident Database at incidentdatabase.ai. The AIID accepts reports of AI-related harms from any submitter. Submission requires: a description of the incident, the AI system involved, the date, affected parties, and links to source documentation.

topaithreats: Submit incidents at the contributing page with supporting evidence, source links, and causal factor mapping.

Post-incident review: Schedule a post-incident review 1–2 weeks after resolution, while details are fresh. Review agenda: what happened, why detection took as long as it did, whether response procedures worked as intended, and what process or technical changes would prevent recurrence. Document outcomes and assign follow-up actions with owners and deadlines.

RACI: Roles and Responsibilities

Activity | Responsible | Accountable | Consulted | Informed
Incident detection and intake | Security / ML Ops | Incident Owner | Engineering | Leadership
Severity triage | Incident Owner | CISO / Risk Officer | Legal | Affected team leads
Containment | Engineering | Incident Owner | Security | Leadership (P1/P2)
Investigation | Engineering / ML | Incident Owner | Legal, Privacy | Risk Officer
Regulatory notification | Legal / Compliance | CISO | Incident Owner | Executive
External database submission | Security / Trust & Safety | Risk Officer | Legal |
Post-incident review | Incident Owner | Risk Officer | All involved | Leadership

Incident Response Readiness Checklist

Before an incident occurs, verify: