Automation Bias
The tendency to favour automated system outputs over independent human judgement, even when those outputs are incorrect.
Definition
Automation bias is a cognitive tendency in which human operators place excessive trust in the outputs of automated or AI systems, accepting algorithmic recommendations without sufficient independent verification. This bias manifests in two primary forms: errors of commission, where operators follow an incorrect automated recommendation; and errors of omission, where operators fail to notice a problem because the automated system did not flag it. Automation bias is distinct from general overreliance in that it specifically describes a systematic decision-making pattern rooted in the perceived authority and consistency of automated systems, and it has been studied extensively in aviation, healthcare, and public administration.
How It Relates to AI Threats
Automation bias is a core failure mode within the Human-AI Control domain: it drives overreliance on AI outputs and renders human-in-the-loop safeguards ineffective. When reviewers systematically defer to algorithmic recommendations, human oversight no longer performs its intended functions of catching errors, exercising contextual judgement, and applying ethical reasoning. Within Discrimination & Social Harm, automation bias amplifies the impact of algorithmic bias: if a biased model consistently produces discriminatory outputs and human reviewers routinely accept them, the discriminatory patterns persist uncorrected through the decision pipeline.
Why It Occurs
- Automated systems are perceived as objective, consistent, and less prone to error than human judgement
- High-volume decision environments create time pressure that discourages independent verification
- Operators often lack visibility into how algorithmic recommendations are generated
- Organisational incentives may reward throughput over the quality of individual case review
- Training and interface design frequently fail to encourage critical evaluation of automated outputs
Real-World Context
Australia’s Robodebt scheme (INC-16-0001) exemplifies automation bias at institutional scale. Caseworkers relied on an automated income-averaging algorithm to issue debt notices, despite evidence that the system produced inaccurate assessments. The human review process was largely perfunctory, with operators defaulting to the system’s determinations. The resulting harm — wrongful debt demands affecting hundreds of thousands of individuals, including documented cases of severe distress — was sustained in part because the institutional culture treated algorithmic outputs as authoritative. The Royal Commission into the scheme identified the failure of meaningful human oversight as a central contributing factor.
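The arithmetic flaw behind those inaccurate assessments is straightforward: the scheme averaged a recipient's annual income evenly across all 26 fortnights, so a person with intermittent work appeared to have earned income during fortnights in which they actually earned nothing. The sketch below is purely illustrative; the payment amount, income-free threshold, and taper rate are hypothetical figures, not the actual Centrelink rules or the Robodebt implementation. It shows how averaging can manufacture a debt for a recipient who reported their income accurately.

```python
# Illustrative sketch only: hypothetical benefit figures, threshold, and taper
# rate. Not the real Centrelink rules or the Robodebt implementation.

FORTNIGHTS_PER_YEAR = 26
INCOME_FREE_THRESHOLD = 300.0   # hypothetical income-free area per fortnight
TAPER_RATE = 0.5                # hypothetical reduction per dollar above the threshold
FULL_BENEFIT = 550.0            # hypothetical full fortnightly payment


def entitlement(fortnightly_income: float) -> float:
    """Benefit payable for one fortnight under the hypothetical income test."""
    excess = max(0.0, fortnightly_income - INCOME_FREE_THRESHOLD)
    return max(0.0, FULL_BENEFIT - TAPER_RATE * excess)


def overpayment(reported_income: float, assessed_income: float) -> float:
    """Debt raised for one fortnight: benefit paid on the reported income,
    minus the entitlement recalculated on the assessed income."""
    return max(0.0, entitlement(reported_income) - entitlement(assessed_income))


# Hypothetical recipient: $26,000 earned across 10 fortnights of work, and no
# earnings in the 16 fortnights they received the benefit and reported zero income.
annual_income = 26_000.0
actual_fortnightly = [2_600.0] * 10 + [0.0] * 16
reported_fortnightly = actual_fortnightly  # the recipient reported accurately

# Income averaging: the annual total spread evenly across all 26 fortnights.
averaged_fortnightly = [annual_income / FORTNIGHTS_PER_YEAR] * FORTNIGHTS_PER_YEAR

true_debt = sum(overpayment(r, a) for r, a in zip(reported_fortnightly, actual_fortnightly))
averaged_debt = sum(overpayment(r, a) for r, a in zip(reported_fortnightly, averaged_fortnightly))

print(f"Debt using actual fortnightly income:   ${true_debt:,.2f}")     # $0.00
print(f"Debt using averaged fortnightly income: ${averaged_debt:,.2f}")  # $5,600.00 in this sketch
```

In this sketch the averaged assessment raises a debt even though no overpayment occurred; a reviewer who treats the system's figure as authoritative confirms a debt that independent verification against the recipient's actual fortnightly records would show does not exist.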
Related Incidents
- Australia Robodebt Automated Welfare Fraud Detection
- DOGE Uses ChatGPT to Flag and Cancel Federal Humanities Grants
- Air Canada Chatbot Hallucinated Refund Policy — Tribunal Ruling
- UK A-Level Algorithm Downgrades Disadvantaged Students
- Boeing 737 MAX MCAS Automation Failures — Two Fatal Crashes
Related Threat Patterns
Related Terms
Last updated: 2026-02-14