Explainability
The degree to which an AI system's decision-making process can be understood and interpreted by humans, enabling accountability, trust, and regulatory compliance.
Definition
Explainability, also referred to as interpretability in some contexts, is the property of an AI system that allows humans to comprehend how and why it produces specific outputs or decisions. Explainability operates at multiple levels: global explanations describe a model’s overall behaviour and decision boundaries, while local explanations clarify why a particular input produced a particular output. Technical approaches include inherently interpretable models such as decision trees, as well as post-hoc explanation methods such as SHAP, LIME, and attention visualisation applied to complex models. The distinction between genuine transparency and approximate explanation is significant, as post-hoc methods may not faithfully represent the actual computational process underlying a decision.
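The local-explanation idea described above can be illustrated with a minimal LIME-style sketch: perturb an input, query the black-box model on the perturbed samples, and fit a proximity-weighted linear surrogate whose coefficients act as local feature importances. This is a simplified illustration under stated assumptions, not the actual LIME implementation; `predict` stands in for any black-box scoring function.

```python
import numpy as np

def local_surrogate(predict, x0, n_samples=500, scale=0.1, seed=0):
    """Fit a weighted linear surrogate around x0 (a LIME-style sketch).

    predict maps an (n, d) array of inputs to an (n,) array of scores.
    Returns one coefficient per feature, approximating the model's
    behaviour in the neighbourhood of x0.
    """
    rng = np.random.default_rng(seed)
    d = x0.shape[0]
    X = x0 + rng.normal(scale=scale, size=(n_samples, d))  # perturbed inputs
    y = predict(X)
    # Weight each sample by its proximity to x0 (RBF kernel).
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * scale ** 2))
    A = np.hstack([X, np.ones((n_samples, 1))])  # add an intercept column
    # Solve the weighted normal equations (A^T W A) beta = A^T W y.
    Aw = A * w[:, None]
    beta = np.linalg.solve(Aw.T @ A, Aw.T @ y)
    return beta[:d]  # per-feature local importances (intercept dropped)
```

On a model that is exactly linear, the surrogate recovers the true coefficients; on a complex model it recovers only a local approximation, which is precisely the fidelity caveat noted above.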
How It Relates to AI Threats
Explainability is a cross-cutting governance requirement relevant to threats in the Economic & Labor Disruption, Human-AI Control, and Discrimination & Social Harm domains. Without explainability, organisations cannot identify whether AI systems produce biased or discriminatory outcomes, affected individuals cannot contest adverse decisions, and regulators cannot verify compliance with anti-discrimination law. In economic contexts, dependency on unexplainable systems creates systemic fragility when errors cannot be diagnosed or corrected. The absence of explainability also facilitates automation bias: lacking the information needed to evaluate AI recommendations critically, human operators tend to default to accepting them.
Why It Occurs
- Deep learning architectures achieve high accuracy through complexity that inherently resists human interpretation
- Post-hoc explanation methods provide approximations that may diverge from actual model reasoning
- Organisations face trade-offs between model performance and interpretability during system design
- Standardised metrics and benchmarks for evaluating explanation quality remain underdeveloped
- Proprietary constraints prevent external researchers and regulators from accessing model internals
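The second point above, that post-hoc explanations are approximations, implies that a surrogate's fidelity should be measured rather than assumed. A hypothetical check, assuming a linear surrogate of the form `X @ coef + intercept` (the function name and signature are illustrative):

```python
import numpy as np

def surrogate_fidelity(predict, coef, intercept, x0, n_samples=500,
                       scale=0.1, seed=0):
    """R^2 of a linear surrogate against the black box near x0.

    Values near 1.0 mean the explanation tracks the model in this
    neighbourhood; low values signal the explanation may be misleading.
    """
    rng = np.random.default_rng(seed)
    X = x0 + rng.normal(scale=scale, size=(n_samples, x0.shape[0]))
    y_model = predict(X)                  # what the model actually does
    y_surrogate = X @ coef + intercept    # what the explanation claims
    ss_res = np.sum((y_model - y_surrogate) ** 2)
    ss_tot = np.sum((y_model - y_model.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

A low fidelity score is evidence that the explanation diverges from the model's actual reasoning in that region, which is exactly the failure mode governance processes need to detect.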
Real-World Context
Incidents such as INC-13-0001 and INC-18-0002 illustrate the consequences of deploying high-stakes AI systems without adequate explainability: biased outcomes persisted undetected because the decision logic could not be audited. The EU AI Act mandates transparency and explainability measures for high-risk AI systems, and the scope of the GDPR's much-debated right to explanation has been tested in court. The U.S. Equal Credit Opportunity Act requires lenders to provide specific reasons for adverse credit decisions, creating de facto explainability requirements for AI-based credit scoring. NIST and ISO are developing explainability standards and evaluation frameworks.
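The adverse-action requirement means attribution output must be translated into human-readable reasons. A hypothetical sketch of that translation step (the feature names and the contribution format are illustrative, not drawn from any cited incident or statute):

```python
def adverse_action_reasons(contributions, k=2):
    """Return the k features that most pushed a decision toward denial.

    contributions: dict mapping feature name -> signed contribution to
    the applicant's score, where negative values lowered the score.
    Sketches how attribution output (e.g. from a local surrogate) could
    feed ECOA-style reason codes.
    """
    negative = [(name, c) for name, c in contributions.items() if c < 0]
    negative.sort(key=lambda item: item[1])  # most negative first
    return [name for name, _ in negative[:k]]
```

Note that the faithfulness caveat carries over: if the underlying attributions do not reflect the model's actual computation, the reason codes inherit that distortion.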
Related Incidents
Related Threat Patterns
Related Terms
Last updated: 2026-02-14