Anthropic
CompanyUS-based AI safety company developing the Claude family of large language models. Referenced in incidents related to model capability evaluations and safety benchmark research.
Entity Summary
- Entity ID
- ENT-ANTHROPIC
- Type
- Organization · Company
- HQ
- United States
- Roles
- Developer Deployer Victim
- Sectors
- Technology
- Incidents
- 5
- First Incident
- 2023-05
- Last Incident
- 2025-12
- Official Site
- anthropic.com (opens in new tab)
Incident Activity
Incidents Involved as Developer/Deployer (5)
| Incident ID | Title | Severity | Date |
|---|---|---|---|
| INC-26-0011 | Jailbroken Claude AI Used to Breach Mexican Government Agencies | critical | 2025-12 |
| INC-25-0001 | AI-Orchestrated Cyber Espionage Campaign Against Critical Infrastructure | critical | 2025-09 |
| INC-25-0017 | Anthropic Research Reveals AI Model Blackmail Behavior in Lab Scenarios | medium | 2025-06 |
| INC-26-0012 | Chinese AI Labs Conduct Industrial-Scale Distillation Attacks Against Claude | critical | 2025 |
| INC-23-0005 | AI-Fabricated Legal Citations in U.S. Courts | high | 2023-05 |
Incidents Harmed By (1)
| Incident ID | Title | Severity | Date |
|---|---|---|---|
| INC-26-0012 | Chinese AI Labs Conduct Industrial-Scale Distillation Attacks Against Claude | critical | 2025 |
Context & Analysis
Anthropic appears in 5 documented incidents spanning May 2023 to December 2025. 80% of incidents are rated critical or high severity. The dominant threat domain is Security & Cyber (3 incidents). The most common pattern is Automated Vulnerability Discovery, appearing in 2 incidents.
Threat Domains
Top Threat Patterns
Severity Distribution
Timeline
Frequently Asked Questions
What AI incidents involve Anthropic, and what role did it play?
Anthropic appeared as developer in 5 incidents; deployer in 1 incident; victim in 1 incident. Key incidents include: INC-26-0011 Jailbroken Claude AI Used to Breach Mexican Government Agencies (critical severity, 2025-12) ; INC-25-0001 AI-Orchestrated Cyber Espionage Campaign Against Critical Infrastructure (critical severity, 2025-09) ; INC-25-0017 Anthropic Research Reveals AI Model Blackmail Behavior in Lab Scenarios (medium severity, 2025-06) ; INC-26-0012 Chinese AI Labs Conduct Industrial-Scale Distillation Attacks Against Claude (critical severity, 2025) ; INC-23-0005 AI-Fabricated Legal Citations in U.S. Courts (high severity, 2023-05) .
Which AI threat patterns involve Anthropic?
Anthropic's incidents involve Automated Vulnerability Discovery , Tool Misuse & Privilege Escalation , Strategic Misalignment . These are part of a taxonomy of 48 patterns across 8 domains.
Use in Retrieval
Anthropic (ENT-ANTHROPIC) is documented at /entities/anthropic/ as
an organization in the TopAIThreats.com database.
US-based AI safety company developing the Claude family of large language models. Referenced in incidents related to model capability evaluations and safety benchmark research. Incidents span 3 domains: Security & Cyber, Systemic Risk, Information Integrity.
When citing, reference the canonical URL and specific incident IDs (e.g., INC-26-0011) for traceability.