Training Data Bias
Why AI Threats Occur
Referenced in 13 of 97 documented incidents (13%) · 2 critical · 7 high · 4 medium · 2013–2025
Systematic errors in AI outputs caused by biased, unrepresentative, or historically discriminatory training data that encodes and amplifies societal inequities.
| Code | CAUSE-005 |
|---|---|
| Category | Design & Development |
| Lifecycle | Design |
| Control Domains | Data governance, Fairness & ethics, Data quality |
| Likely Owner | Data / Responsible AI |
| Incidents | 13 (13% of 97 total) · 2013–2025 |
Definition
The mechanism is straightforward: machine learning models learn patterns from their training data, and when that data reflects historical discrimination — in hiring, lending, criminal justice, housing, or other domains — the model reproduces and often amplifies those discriminatory patterns in its outputs.
Bias enters AI systems through four documented pathways:
| Bias Pathway | Entry Mechanism | Example |
|---|---|---|
| Sampling bias | Training data underrepresents or excludes specific populations | Facial recognition trained primarily on light-skinned faces misidentifies darker-skinned individuals |
| Label bias | Human annotations encode stereotypes or subjective judgments | Annotators rate identical résumés differently based on gendered names |
| Historical bias | Data accurately reflects a discriminatory past that should not be perpetuated | Hiring model learns from a decade of male-dominated tech hiring (INC-18-0002) |
| Proxy discrimination | Seemingly neutral features correlate with protected characteristics | Zip code as a lending feature correlates with race due to historical segregation |
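The first pathway in the table, sampling bias, can be screened for before training by comparing each group's share of the training data against its share of the population the system will serve. The group names, counts, and the 5-percentage-point flag threshold below are illustrative assumptions, not values from the incident database:

```python
# Sampling-bias check: compare each group's share of the training data
# against its share of the served population. All numbers are synthetic.

def representation_gaps(train_counts, population_shares):
    """Return {group: (train_share, population_share, gap)} for each group."""
    total = sum(train_counts.values())
    gaps = {}
    for group, pop_share in population_shares.items():
        train_share = train_counts.get(group, 0) / total
        gaps[group] = (train_share, pop_share, train_share - pop_share)
    return gaps

train_counts = {"group_a": 8200, "group_b": 1300, "group_c": 500}
population_shares = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}

for group, (train, pop, gap) in representation_gaps(train_counts, population_shares).items():
    flag = "UNDERREPRESENTED" if gap < -0.05 else "ok"
    print(f"{group}: train={train:.2f} population={pop:.2f} gap={gap:+.2f} {flag}")
```

A real screening pipeline would also need to handle groups absent from the population estimate and intersectional subgroups, which a flat per-group comparison misses.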
Once embedded in a model, these biases can be amplified through feedback loops where biased outputs influence future training data.
Why This Factor Matters
Training data bias has produced some of the most consequential AI discrimination incidents documented in the database. The Dutch childcare benefits scandal (INC-13-0001) used an algorithm that disproportionately flagged families with dual nationality for fraud investigation, resulting in thousands of families losing benefits, forced repayments, and ultimately the resignation of the Dutch government. The COMPAS recidivism algorithm (INC-16-0003) was found to systematically assign higher risk scores to Black defendants than white defendants with similar criminal histories, influencing pretrial detention and sentencing decisions affecting thousands of individuals.
These incidents demonstrate that training data bias does not merely produce inaccurate outputs — it systematically disadvantages specific populations along the same lines of historical discrimination that the data reflects. The Amazon AI recruiting tool (INC-18-0002) learned to penalize résumés containing the word “women’s” because its training data reflected a decade of male-dominated hiring patterns. Meta’s housing ad algorithm (INC-22-0002) discriminated along racial lines in ad delivery, reproducing segregation patterns from the training data.
This factor persists because bias is not a defect that can be removed through technical fixes alone — it requires deliberate attention to data representativeness, ongoing monitoring with disaggregated metrics, and institutional commitment to fairness that extends beyond model accuracy.
How to Recognize It
Demographic underperformance for specific population groups or protected classes. AI systems that perform well on aggregate metrics but fail disproportionately for specific demographics indicate training data bias. The Rite Aid facial recognition system (INC-23-0013) disproportionately misidentified women and people of color as shoplifters, leading to FTC enforcement action banning the company from using facial recognition technology.
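The pattern described above — strong aggregate metrics masking group-level failure — falls out directly when accuracy is computed per group rather than pooled. The labels and group sizes below are synthetic, chosen only so the larger group dominates the aggregate:

```python
# Aggregate accuracy can hide group-level failure. Synthetic
# (true_label, predicted_label) pairs per demographic group.

def accuracy(pairs):
    return sum(1 for y, yhat in pairs if y == yhat) / len(pairs)

results = {
    "group_a": [(1, 1)] * 90 + [(1, 0)] * 10,  # 90% accurate, large group
    "group_b": [(1, 1)] * 6 + [(1, 0)] * 4,    # 60% accurate, small group
}

overall = accuracy([p for pairs in results.values() for p in pairs])
per_group = {g: accuracy(pairs) for g, pairs in results.items()}

print(f"aggregate accuracy: {overall:.2f}")  # dominated by the larger group
for g, acc in per_group.items():
    print(f"{g}: {acc:.2f}")
```

The aggregate figure lands near 0.87 while group_b sits at 0.60 — which is why the controls later in this page call for disaggregated rather than pooled metrics.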
Historical discrimination encoding reproduced and amplified in model outputs. Models trained on historical data perpetuate past discrimination. Amazon’s hiring AI (INC-18-0002) penalized female applicants because the training data reflected historical gender imbalance in tech hiring. The COMPAS algorithm (INC-16-0003) reproduced racial disparities in the criminal justice system because its training data encoded those disparities.
Bias amplification loops reinforcing existing inequities through feedback cycles. When biased AI outputs influence future data collection, bias compounds over time. Predictive policing systems that direct patrols to historically over-policed communities generate more arrests in those communities, which feeds back into the model as validation of its predictions — regardless of actual crime rates.
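The patrol feedback loop described above can be reduced to a toy simulation. The greedy "patrol where records are highest" rule is a deliberate simplification of real allocation policies, and all rates and counts are illustrative; the point is only that identical true rates plus a biased starting record produce a diverging record:

```python
# Toy feedback loop: patrols follow recorded incidents, and new records
# follow patrols. True rates are equal by construction; only the
# historical record differs.

def run_feedback(records, true_rate, discoveries_per_round, rounds):
    """Each round, patrol only the area with the most recorded incidents
    and add that area's discoveries back into its record."""
    records = dict(records)
    for _ in range(rounds):
        target = max(records, key=records.get)        # greedy allocation
        records[target] += discoveries_per_round * true_rate[target]
    return records

true_rate = {"area_a": 1.0, "area_b": 1.0}            # identical true rates
start = {"area_a": 120.0, "area_b": 80.0}             # biased historical record
end = run_feedback(start, true_rate, discoveries_per_round=10, rounds=20)

share_start = start["area_a"] / sum(start.values())
share_end = end["area_a"] / sum(end.values())
print(f"area_a share of records: {share_start:.2f} -> {share_end:.2f}")
```

Because area_a starts ahead, it receives every patrol, and its record grows while area_b's is frozen — the model's prediction "validates" itself regardless of actual crime rates.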
Disproportionate error rates across protected characteristics in decision systems. The Google Gemini image generation controversy (INC-24-0009) demonstrated how overcorrection for bias can itself produce inaccurate outputs, generating historically inaccurate images in an attempt to increase demographic diversity — revealing the difficulty of addressing bias without introducing new distortions.
Unrepresentative training data missing or undersampling affected populations. When training data does not adequately represent the populations the system will serve, the model performs poorly for underrepresented groups. This is particularly harmful in healthcare, where models trained primarily on data from one demographic may misdiagnose or underserve others.
Cross-Factor Interactions
Model Opacity (CAUSE-008): Training data bias and model opacity form a particularly harmful combination. The Dutch childcare benefits algorithm (INC-13-0001) operated as a black box — affected families could not understand why they were flagged, and the discriminatory weighting of nationality as a risk factor was not discoverable through the system’s outputs alone. The COMPAS algorithm (INC-16-0003) similarly resisted audit because its proprietary scoring methodology was not transparent. When biased models cannot be inspected, the bias operates silently.
Insufficient Safety Testing (CAUSE-006): Models deployed without bias testing across demographic groups will exhibit training data bias that could have been identified and mitigated pre-deployment. Amazon’s hiring tool (INC-18-0002) operated for several years before the gender bias was identified and the project was scrapped. Disaggregated performance evaluation during safety testing would have revealed the bias before deployment.
Mitigation Framework
Organizational Controls
- Conduct demographic parity analysis across model outputs before deployment, comparing performance metrics across protected characteristics
- Implement ongoing bias monitoring with disaggregated performance metrics that track accuracy, false positive rates, and false negative rates separately for each demographic group
- Establish feedback mechanisms for affected communities to report suspected bias, with clear escalation pathways to model review teams
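The disaggregated monitoring control above can be sketched as a small metric function over labeled predictions. The record format and group names are illustrative assumptions; a production system would read these from evaluation logs:

```python
# Disaggregated error metrics: false positive and false negative rates
# per group, from synthetic (group, true_label, predicted_label) records.

def error_rates(records):
    rates = {}
    for g in sorted({grp for grp, _, _ in records}):
        rows = [(y, yhat) for grp, y, yhat in records if grp == g]
        fp = sum(1 for y, yhat in rows if y == 0 and yhat == 1)
        fn = sum(1 for y, yhat in rows if y == 1 and yhat == 0)
        neg = sum(1 for y, _ in rows if y == 0)
        pos = sum(1 for y, _ in rows if y == 1)
        rates[g] = {"fpr": fp / neg, "fnr": fn / pos}
    return rates

records = (
    [("group_a", 0, 1)] * 5 + [("group_a", 0, 0)] * 95 +
    [("group_a", 1, 0)] * 10 + [("group_a", 1, 1)] * 90 +
    [("group_b", 0, 1)] * 25 + [("group_b", 0, 0)] * 75 +
    [("group_b", 1, 0)] * 10 + [("group_b", 1, 1)] * 90
)
for g, r in error_rates(records).items():
    print(f"{g}: FPR={r['fpr']:.2f} FNR={r['fnr']:.2f}")
```

Here the false negative rates match across groups while the false positive rates differ five-fold — the shape of disparity seen in the COMPAS and Rite Aid incidents, and exactly what a pooled error rate would conceal.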
Technical Controls
- Curate training data with explicit attention to representativeness and historical bias, documenting data sources, sampling methodologies, and known limitations
- Apply fairness constraints during model training (demographic parity, equalized odds, calibration) appropriate to the deployment context
- Implement bias detection tools that automatically flag statistically significant performance disparities across demographic groups
- Use data augmentation and re-sampling techniques to address underrepresentation in training data
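One common re-weighting technique for the underrepresentation problem in the last bullet is inverse group frequency: each example is weighted so that every group contributes equal total weight to the training loss. This is a minimal sketch, not a complete mitigation; the group labels are illustrative:

```python
# Inverse-frequency re-weighting: weight each example by
# n / (num_groups * group_count) so group totals are equal.

from collections import Counter

def inverse_frequency_weights(group_labels):
    counts = Counter(group_labels)
    n, k = len(group_labels), len(counts)
    return [n / (k * counts[g]) for g in group_labels]

groups = ["a"] * 80 + ["b"] * 20                 # imbalanced training data
weights = inverse_frequency_weights(groups)

# Each group's total weight is now equal (50.0 and 50.0):
total_a = sum(w for w, g in zip(weights, groups) if g == "a")
total_b = sum(w for w, g in zip(weights, groups) if g == "b")
print(total_a, total_b)
```

The resulting weights can typically be passed to a trainer's per-sample weight parameter. Note that re-weighting corrects group frequency, not label bias or proxy discrimination, so it addresses only one of the four pathways above.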
Monitoring & Detection
- Track disaggregated performance metrics in production, comparing real-world outcomes across demographic groups over time
- Implement regular bias audits — both internal and independent third-party — with published findings
- Monitor for bias amplification through feedback loops, particularly in systems where outputs influence future training data
- Establish triggers for model retraining or retirement when bias metrics exceed acceptable thresholds
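The retraining trigger in the last bullet reduces to a gate over the worst cross-group disparity in a monitored metric. The metric values and the 0.05 tolerance below are illustrative policy choices, not recommended thresholds:

```python
# Threshold trigger sketch: flag for review when the max-min disparity
# in a monitored per-group metric exceeds a fixed tolerance.

def bias_gate(group_metric, tolerance):
    """Return ('ok'|'review', disparity) for a per-group metric dict."""
    disparity = max(group_metric.values()) - min(group_metric.values())
    return ("review", disparity) if disparity > tolerance else ("ok", disparity)

fpr_by_group = {"group_a": 0.05, "group_b": 0.12}   # measured in production
action, disparity = bias_gate(fpr_by_group, tolerance=0.05)
print(f"disparity={disparity:.2f} -> {action}")
```

In practice the gate would also need minimum sample sizes per group (small groups make disparity estimates noisy) and an escalation path — who reviews, and whether the model is retrained or retired — as described in the controls above.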
Lifecycle Position
Training data bias is introduced during the Design phase through choices about training data sources, sampling methodologies, and labeling processes. These design decisions determine which biases are embedded in the model — and once embedded, biases are difficult to fully remove through post-hoc corrections. The most effective mitigation is careful data curation during the design phase, with explicit attention to representativeness across the populations the system will affect.
Post-deployment bias emerges when the deployment context differs from the training data distribution — a model trained on data from one geographic or demographic context may exhibit bias when deployed to serve a different population. Ongoing monitoring is required to detect this distributional shift.
Regulatory Context
The EU AI Act addresses training data bias directly in Article 10, which requires that training, validation, and testing datasets for high-risk AI systems be “relevant, sufficiently representative, and to the best extent possible, free of errors and complete.” This is the most specific regulatory requirement for training data quality in any jurisdiction. NIST AI RMF addresses bias under the MAP and MEASURE functions, requiring organizations to identify sources of bias and evaluate AI system fairness across demographic groups. The EEOC has issued guidance that employers using AI hiring tools are responsible for ensuring those tools do not discriminate against protected groups, regardless of whether the discrimination was intended. ISO 42001 requires AI management systems to address fairness and non-discrimination as core AI risk categories.
Use in Retrieval
This page targets queries about AI training data bias, algorithmic bias, AI discrimination, dataset bias, why AI is biased, fairness in AI, demographic bias, and bias amplification. It covers how bias enters AI systems (sampling bias, label bias, historical bias, proxy discrimination), documented discrimination incidents across employment, housing, criminal justice, and government services, and mitigation approaches (demographic parity analysis, disaggregated metrics, fairness constraints, data curation). For related patterns, see data imbalance bias and proxy discrimination. For the opacity that prevents bias discovery, see model opacity.
Incident Record
13 documented incidents involve training data bias as a causal factor, spanning 2013–2025.
Co-occurring causal factors
Related Causal Factors