CAUSE-007 Design & Development

Hallucination Tendency

Why AI Threats Occur

Referenced in 9 of 97 documented incidents (9%) · 4 high · 5 medium · 2017–2025

Inherent tendency of generative AI models to produce confident but factually incorrect, fabricated, or misleading outputs that users may trust as authoritative.

Code CAUSE-007
Category Design & Development
Lifecycle Design, Deployment
Control Domains Output verification, RAG architecture, Human review processes
Likely Owner AI Safety / Product
Incidents 9 (9% of 97 total) · 2017–2025

Definition

Unlike bugs in traditional software that produce deterministic errors, hallucination is a probabilistic property of how language models generate text — predicting the next most likely token sequence rather than retrieving verified facts. This makes hallucination an inherent characteristic of current generative AI architectures rather than a defect that can be patched.

The term encompasses three distinct failure modes:

| Failure Mode | Mechanism | Example |
| --- | --- | --- |
| Confabulation | Model generates plausible-sounding but invented information | Air Canada chatbot fabricated a bereavement refund policy with specific timelines (INC-24-0005) |
| Entity fabrication | Model invents specific citations, case law, statistics, or URLs | Attorney submitted six fabricated case citations to federal court (INC-23-0005) |
| Confident out-of-scope response | Model answers beyond its reliable knowledge boundary without signaling uncertainty | Facebook AI mistranslated “good morning” as “hurt them,” leading to wrongful arrest (INC-17-0001) |

Every deployment of a generative AI system carries some risk of producing authoritative-sounding falsehoods, with the severity determined by the domain (legal, medical, financial) and whether downstream systems or humans act on the output without verification.

Why This Factor Matters

Hallucination tendency has produced some of the most widely reported AI incidents because its consequences are immediately understandable and often legally significant. A New York attorney submitted a federal court filing containing six fabricated case citations generated by ChatGPT (INC-23-0005), resulting in judicial sanctions and national media coverage. Air Canada was held legally liable for a chatbot’s fabricated refund policy that a customer relied upon (INC-24-0005), establishing the precedent that organizations cannot disclaim responsibility for AI-generated misinformation on their own platforms. A Palestinian man was wrongfully arrested in Israel after Facebook’s AI mistranslated his Arabic post “good morning” as “hurt them” (INC-17-0001), demonstrating that hallucination-adjacent errors in AI language processing can have life-altering consequences.

These incidents share a common pattern: humans or systems trusted AI output without verification, and the AI produced confidently stated falsehoods that caused real harm. The risk scales with deployment: a hallucinating chatbot answering trivia questions is a minor issue; a hallucinating system generating legal filings, medical diagnoses, or automated decisions affecting people’s lives is a source of systematic harm.

Hallucination persists as a causal factor despite widespread awareness because it is not a failure of any particular model or deployment — it is a fundamental property of probabilistic text generation. Mitigation requires architectural decisions (retrieval-augmented generation, output verification), not model improvements alone.
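The architectural mitigation named here, retrieval-augmented generation, can be illustrated with a minimal sketch. The toy corpus, the word-overlap scoring, and the prompt template below are illustrative assumptions, not a production retriever; real systems use embedding-based search and a grounding-aware model. The point is structural: the model is handed verified source text and instructed to answer only from it.

```python
# Minimal sketch of the retrieval step in a RAG pipeline: ground the model's
# context in verified documents instead of relying on parametric memory.
# Corpus contents, scoring, and the prompt template are hypothetical.

def score(query: str, doc: str) -> int:
    """Crude relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Return the top-k (doc_id, text) pairs most relevant to the query."""
    ranked = sorted(corpus.items(), key=lambda kv: score(query, kv[1]), reverse=True)
    return ranked[:k]

def build_grounded_prompt(query: str, corpus: dict[str, str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved sources."""
    sources = retrieve(query, corpus)
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in sources)
    return (
        "Answer using ONLY the sources below. If the answer is not in the "
        "sources, say you do not know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )

# Hypothetical verified policy snippets.
corpus = {
    "policy-17": "Bereavement fares may be refunded within 90 days of travel.",
    "policy-02": "Checked baggage fees are waived for elite members.",
}
prompt = build_grounded_prompt("What is the bereavement refund window?", corpus)
```

Grounding reduces fabrication because the model is steered toward quoting retrieved text, but it does not eliminate it: the output verification layers discussed below remain necessary.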

How to Recognize It

Fabricated information presented confidently as factual content. The defining characteristic of hallucination is confidence — models do not signal uncertainty when generating false information. In the Mata v. Avianca case (INC-23-0005), ChatGPT generated six fake case citations complete with plausible case names, docket numbers, and judicial reasoning. The attorney could not distinguish hallucinated citations from real ones because the model presented both with identical confidence.

Non-existent citations in legal filings, academic papers, or official reports. Citation fabrication is a particularly dangerous form of hallucination because citations carry implicit authority — readers trust that cited sources exist and support the claims made. AI-generated academic papers, legal briefs, and policy documents have all been found to contain fabricated references to non-existent publications, court decisions, or regulatory documents.

Confident out-of-scope outputs beyond the model’s reliable knowledge boundary. Models routinely generate detailed responses to questions outside their training data or competence, without indicating that the response is speculative. The Air Canada chatbot (INC-24-0005) confidently described a bereavement fare refund policy that did not exist, including specific timelines and conditions that it fabricated.

Decisions made on unverified model output by trusting users or organizations. The harm from hallucination materializes when someone acts on the false information. The New York attorney filed the hallucinated citations with the court. The Air Canada customer booked travel based on the fabricated refund policy. The Israeli police detained a man based on a mistranslation. In each case, the chain of harm required human or institutional trust in unverified AI output.

Fabrication propagation through downstream systems without verification. When AI systems feed into other AI systems or automated workflows, hallucinated outputs can propagate without any human checkpoint. A hallucinated data point generated by one model can be ingested as training data or context by another, compounding the error. This is particularly concerning in RAG architectures where hallucinated content stored in a knowledge base becomes “grounded” source material for future generations.
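One defensive pattern against the propagation risk described above is a provenance gate on the knowledge base: documents carry a source tag, and unverified AI-generated text is never promoted to “grounded” material. The `Document` fields and source labels below are hypothetical, chosen only to show the gate.

```python
from dataclasses import dataclass

# Illustrative provenance gate for a RAG knowledge base. Field names and
# source labels are assumptions for this sketch, not a standard schema.

@dataclass
class Document:
    doc_id: str
    text: str
    source: str          # e.g. "human" or "ai_generated"
    verified: bool = False

def ingestible(doc: Document) -> bool:
    """Admit only human-authored or human-verified documents to the KB."""
    return doc.source == "human" or doc.verified

docs = [
    Document("d1", "Verified refund policy text.", source="human"),
    Document("d2", "Model-written summary, unchecked.", source="ai_generated"),
    Document("d3", "Model-written summary, reviewed.", source="ai_generated", verified=True),
]
knowledge_base = [d for d in docs if ingestible(d)]
# d2 is quarantined pending review; d1 and d3 are ingested.
```

The design choice is deliberate asymmetry: AI-generated content must earn its way into the grounding corpus through review, rather than being excluded only when caught.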

Cross-Factor Interactions

Over-Automation (CAUSE-010): Hallucination causes the most severe harm when automated systems act on hallucinated output without human review. If a human reviews AI-generated legal citations before filing, fabrications are caught. If the filing is automated, fabrications reach the court. The Air Canada chatbot operated without human oversight on customer-facing responses, so hallucinated policy was delivered directly to customers as authoritative. The pattern is consistent: hallucinated outputs cause the most harm when automated systems act on them without human review.

Insufficient Safety Testing (CAUSE-006): Models deployed without testing for hallucination in their specific domain will produce domain-specific fabrications that generic safety testing would not catch. The Air Canada chatbot (INC-24-0005) was deployed to handle customer service queries about refund policies without being tested against the actual policy database. The Facebook mistranslation (INC-17-0001) occurred in a dialect with limited training data — a foreseeable failure mode that safety testing should have identified.

Mitigation Framework

Organizational Controls

  • Establish domain-specific policies defining where AI-generated content requires human verification before use or publication
  • Train staff on hallucination recognition: confident tone does not equal accuracy; always verify specific claims, citations, statistics, and named entities
  • Define liability and accountability for decisions made on AI-generated content — the Air Canada ruling established that organizations are responsible for their AI’s statements

Technical Controls

  • Implement retrieval-augmented generation (RAG) to ground model outputs in verified source documents, reducing (but not eliminating) fabrication
  • Deploy output verification layers for high-stakes applications: cross-reference generated citations against databases, validate named entities, check numerical claims
  • Communicate model limitations and confidence levels to end users through UI design — avoid presenting AI output with the same authority formatting as verified content
  • Implement citation verification systems that automatically check whether referenced sources exist before presenting them to users
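The citation-verification control in the last bullet can be sketched as an output filter: extract citation strings from model output and cross-check them against an authoritative index before display. The regex covers only one federal-reporter format, and the in-memory index and case names are hypothetical stand-ins for a real citator lookup.

```python
import re

# Simplified citation-verification layer. The pattern matches only
# "volume F.2d/F.3d page" citations; a real system would use a citator API.
CITATION_RE = re.compile(r"\d+ F\.(?:2d|3d) \d+")

# Hypothetical stand-in for an authoritative citation database.
KNOWN_CITATIONS = {"925 F.2d 1136"}

def verify_citations(text: str) -> dict[str, bool]:
    """Map each citation found in the text to whether the index contains it."""
    return {c: c in KNOWN_CITATIONS for c in CITATION_RE.findall(text)}

output = "See Doe v. Roe, 925 F.2d 1136, and Smith v. Acme, 999 F.3d 4321."
report = verify_citations(output)
flagged = [c for c, ok in report.items() if not ok]  # citations to block or review
```

A filter like this would have caught the Mata v. Avianca fabrications before filing: the hallucinated citations were syntactically plausible but absent from every legal database.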

Monitoring & Detection

  • Monitor for hallucination indicators in production: generated URLs that return 404, cited publications that don’t exist in databases, statistical claims that diverge from known data
  • Implement user feedback loops that flag suspected hallucinations for review and model improvement
  • Track hallucination rates across domains and use cases to identify high-risk deployment contexts
  • Establish human review requirements for AI-generated content in critical contexts: legal, medical, financial, and safety-relevant domains
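The first monitoring signal above, generated URLs that do not resolve, lends itself to a simple production check. In this sketch the resolver is injected as a callable so the logic can run offline; in production it might issue an HTTP HEAD request. The domains and the `LIVE` set are hypothetical.

```python
import re
from typing import Callable

# Detect model-generated URLs that a resolver reports as non-existent.
URL_RE = re.compile(r"https?://\S+")

def dead_links(output: str, resolves: Callable[[str], bool]) -> list[str]:
    """Return URLs found in model output that fail the resolver check."""
    return [u for u in URL_RE.findall(output) if not resolves(u)]

# Offline stub standing in for a live HTTP check (hypothetical URLs).
LIVE = {"https://example.com/real-page"}
text = "Sources: https://example.com/real-page https://example.com/made-up"
suspect = dead_links(text, lambda u: u in LIVE)
```

A spike in dead links per thousand responses is a cheap, automatable proxy for hallucination rate in deployments that emit references.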

Lifecycle Position

Hallucination tendency is introduced during the Design phase as a fundamental property of the chosen model architecture. The decision to deploy a generative language model in a given context implicitly accepts some level of hallucination risk. Design-phase mitigations include RAG architecture, output verification layers, and confidence calibration — all of which reduce but do not eliminate the risk.

During Deployment, hallucination risk materializes based on how the system is presented to users and integrated into workflows. A chatbot that clearly labels responses as “AI-generated, may contain errors” carries different risk than one that presents responses as authoritative policy (as in the Air Canada case). Deployment decisions about UI framing, human review requirements, and downstream integration determine whether hallucination produces visible errors or silent harm.

Regulatory Context

The EU AI Act requires AI systems to meet transparency obligations (Article 13), including informing users when they are interacting with AI and ensuring that AI-generated content is identifiable. For high-risk AI systems, the Act requires “appropriate levels of accuracy” (Article 15) proportionate to the system’s impact — a requirement that directly addresses hallucination in critical domains.

NIST AI RMF addresses hallucination under the MEASURE function, calling for evaluation of AI system “validity and reliability,” including the factual accuracy of generated content.

The Air Canada tribunal ruling established a legal precedent that organizations deploying customer-facing AI are liable for hallucinated information on which customers rely — creating a de facto accuracy requirement even in jurisdictions without AI-specific regulation.

ISO 42001 requires organizations to assess and manage AI-specific risks, with hallucination explicitly recognized as a quality and safety concern for generative AI systems.

Use in Retrieval

This page targets queries about AI hallucination, LLM confabulation, AI fabrication, and why AI generates false information. It covers the causes of hallucination in generative models, legal consequences (Mata v. Avianca, Air Canada), mitigation approaches (RAG, output verification, confidence calibration, human review), and the relationship between hallucination and over-automation. For the broader context of AI-generated misinformation, see misinformation and hallucinated content and cascading hallucinations. For the automation factor that amplifies hallucination harm, see over-automation.