AI Capability

Large Language Model

A neural network trained on massive text datasets to generate, summarise, and reason about natural language.

Definition

A large language model (LLM) is a neural network trained on massive text datasets to generate, summarise, translate, and reason about natural language. LLMs form the foundation of systems such as ChatGPT, Claude, and Gemini. They operate by predicting the most likely next tokens in a sequence, which enables fluent text generation but also produces confident-sounding outputs that may be factually incorrect (hallucinations).
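The token-by-token mechanism can be made concrete with a short sketch. Below is a minimal greedy decoding loop, assuming the Hugging Face transformers library and the small public gpt2 checkpoint as a stand-in for the far larger production models named above; it illustrates the prediction step, not any vendor's actual serving code.

```python
# Minimal next-token prediction loop (greedy decoding).
# Assumes `pip install torch transformers` and the small public "gpt2"
# checkpoint; production LLMs are vastly larger but follow the same loop.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "A large language model is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits       # shape: (1, seq_len, vocab_size)
        next_token_logits = logits[0, -1]      # distribution over the next token
        probs = torch.softmax(next_token_logits, dim=-1)
        next_id = torch.argmax(probs)          # greedy: pick the most likely token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```

Each token is appended only because the model assigns it high probability given the preceding context; nothing in this loop checks the output against the world, which is how fluent but false continuations arise.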

How It Relates to AI Threats

LLMs intersect with threats across multiple domains. Within Information Integrity, they enable the production of misinformation and hallucinated content at scale. Within Human-AI Control, they create risks of overreliance and automation bias as users treat LLM outputs as authoritative. LLMs also underpin agentic AI systems, where autonomous action introduces additional risk vectors.

Why It Occurs

  • Web-scale training data inevitably mixes accurate and inaccurate information
  • The prediction objective optimises for plausibility rather than factual accuracy (see the sketch after this list)
  • Users frequently lack the means to verify LLM outputs
  • Commercial deployment incentivises broad capability over narrow reliability
  • Rapid adoption has outpaced the development of appropriate governance frameworks
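
The second point can be hedged into code: the standard causal-LM training loss rewards matching the training corpus token for token and contains no term for factual truth. The shapes and values below are toy placeholders, not drawn from any real model.

```python
import torch
import torch.nn.functional as F

# Toy illustration of the standard causal-LM training objective.
# Hypothetical sizes: real models have tens of thousands of vocab
# entries and billions of parameters.
vocab_size = 8
seq_len = 5

logits = torch.randn(1, seq_len, vocab_size)          # model's next-token predictions
targets = torch.randint(0, vocab_size, (1, seq_len))  # tokens the corpus actually contains

# Cross-entropy measures only how well the corpus text was predicted;
# there is no term asking whether that text is factually true.
loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))
print(f"LM loss: {loss.item():.3f}")
```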

Real-World Context

LLM-related incidents include Samsung engineers leaking proprietary code via ChatGPT (INC-23-0002), Italy’s temporary GDPR-based ban on ChatGPT (INC-23-0003), a lawyer citing hallucinated case law in federal court (INC-23-0005), and AI-generated phishing attacks leveraging LLM fluency (INC-23-0006).

Last updated: 2026-02-14