INC-20-0004: Pulse Oximeter Racial Bias Propagates into AI Clinical Decision Systems (2020)
Status: confirmed | Severity: high | Failure Stage: Systemic Risk
Hospitals and healthcare systems deployed AI clinical decision support systems and triage algorithms that rely on data from pulse oximeters developed by device manufacturers. The harm fell on Black patients and other individuals with darker skin tones, who received inaccurate oxygen readings, and on COVID-19 patients whose treatment was delayed by biased measurements. Contributing factors included training data bias and insufficient safety testing.
Incident Details
| Date Occurred | 2020-12 | Severity | high |
| Evidence Level | primary | Impact Level | Society-Wide |
| Failure Stage | Systemic Risk | ||
| Domain | Discrimination & Social Harm | ||
| Primary Pattern | PAT-SOC-004 Proxy Discrimination | ||
| Secondary Patterns | PAT-CTL-004 Overreliance & Automation Bias | ||
| Regions | North America, United States, Global | ||
| Sectors | Healthcare | ||
| Affected Groups | Vulnerable Communities, General Public | ||
| Exposure Pathways | Algorithmic Decision Impact | ||
| Causal Factors | Training Data Bias, Insufficient Safety Testing | ||
| Assets & Technologies | Decision Automation | ||
| Entities | Pulse oximeter manufacturers (developer), Hospitals and healthcare systems using AI-driven triage tools (deployer) | ||
| Harm Types | physical, rights violation | ||
A landmark 2020 NEJM study demonstrated that pulse oximeters systematically overestimate blood oxygen levels in Black patients, with occult hypoxemia occurring nearly three times more frequently in Black patients (11.7%) than in White patients (3.6%). Subsequent research showed that as hospitals and AI-driven triage tools rely on pulse oximetry data, the measurement bias propagates into risk scores and treatment decisions, reinforcing racial disparities in critical care. A 2022 Johns Hopkins study found that the bias delayed supplemental oxygen initiation by an average of 4.6 hours for Black COVID-19 patients. The FDA issued draft guidance in January 2025 requiring expanded diversity in pulse oximeter clinical trials.
Incident Summary
In December 2020, a landmark study published in the New England Journal of Medicine demonstrated that pulse oximeters — widely used medical devices that measure blood oxygen saturation — systematically overestimate oxygen levels in Black patients and individuals with darker skin tones.[1] The study found that occult hypoxemia (dangerously low blood oxygen undetected by pulse oximetry) occurred in 11.7% of Black patients compared to 3.6% of White patients — a nearly threefold disparity.
A subsequent 2022 Johns Hopkins study published in JAMA Internal Medicine quantified the clinical consequences during the COVID-19 pandemic: the bias delayed supplemental oxygen initiation by an average of 4.6 hours and dexamethasone treatment by 37 minutes for Black patients.[2] As hospitals increasingly deploy AI-driven triage tools and clinical decision support systems that rely on pulse oximetry data, the measurement bias propagates into algorithmic risk scores and automated treatment recommendations, reinforcing racial disparities in critical care at scale.
The FDA issued draft guidance in January 2025 requiring pulse oximeter manufacturers to expand premarket clinical trials from 10 to 150 participants and to demonstrate participant diversity using standardized skin tone measurement scales.[3]
Key Facts
- Bias magnitude: Pulse oximeters overestimate SpO2 by approximately 1.2–1.5% in Black patients; some devices show errors up to 5%
- Occult hypoxemia rate: 11.7% in Black patients vs. 3.6% in White patients (NEJM, University of Michigan cohort)
- Clinical impact: 4.6-hour average delay in supplemental oxygen for Black COVID-19 patients (JAMA Internal Medicine, Johns Hopkins)
- AI propagation: Biased pulse oximetry data feeds into EHR-based risk scores, triage algorithms, and machine learning clinical decision systems
- FDA response: January 2025 draft guidance requiring 150 premarket trial participants (up from 10) with Monk Skin Tone scale diversity requirements
- Known since: The general phenomenon has been documented since approximately 1990, but clinical significance was not rigorously quantified until 2020
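The headline figures above can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this entry; the variable names are illustrative. Note that 11.7% against 3.6% works out to a ratio of about 3.25, which this entry rounds to "nearly threefold".

```python
# Figures quoted in this entry (NEJM University of Michigan cohort, FDA draft guidance).
occult_rate_black = 0.117   # occult hypoxemia rate, Black patients
occult_rate_white = 0.036   # occult hypoxemia rate, White patients

# Relative and absolute disparity between the two rates.
rate_ratio = occult_rate_black / occult_rate_white
rate_difference = occult_rate_black - occult_rate_white

# Premarket trial size before and after the January 2025 draft guidance.
participants_before = 10
participants_after = 150
fold_increase = participants_after / participants_before

print(f"Rate ratio: {rate_ratio:.2f}x")
print(f"Absolute difference: {rate_difference * 100:.1f} percentage points")
print(f"Trial size increase: {fold_increase:.0f}x")
```

The absolute difference (about 8.1 percentage points) is worth reporting alongside the ratio, since a ratio alone does not convey how many patients are affected.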
Threat Patterns Involved
Primary: Proxy Discrimination — A medical device’s systematic measurement error correlated with skin pigmentation propagates into AI clinical decision systems, functioning as a proxy that produces racially disparate treatment outcomes without explicit use of racial data.
Secondary: Overreliance and Automation Bias — Clinical workflows and AI triage tools treat pulse oximetry readings as ground truth without accounting for known racial measurement bias, perpetuating the error through automated decision chains.
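A minimal sketch of how the two patterns interact, assuming a toy threshold rule rather than any real triage product. The thresholds, the patient values, and the names `Patient`, `triage_flags_oxygen`, and `missed_hypoxemia` are all hypothetical; the 5-point offset reflects the upper end of the device error quoted under Key Facts. The rule never sees race, yet a measurement offset correlated with skin pigmentation changes who gets flagged.

```python
from dataclasses import dataclass

# All numbers below are hypothetical illustration values, not study data.
OXYGEN_THRESHOLD = 92.0   # toy rule: flag for oxygen when the reading is below 92%
HYPOXEMIA_CUTOFF = 88.0   # toy cutoff for true arterial saturation


@dataclass
class Patient:
    true_sao2: float       # arterial oxygen saturation (ground truth)
    device_offset: float   # how much the oximeter overestimates, in points

    @property
    def measured_spo2(self) -> float:
        # Readings cannot exceed 100%.
        return min(100.0, self.true_sao2 + self.device_offset)


def triage_flags_oxygen(spo2: float) -> bool:
    """Toy triage rule that treats the oximeter reading as ground truth."""
    return spo2 < OXYGEN_THRESHOLD


def missed_hypoxemia(patients: list[Patient]) -> int:
    """Count patients who are truly hypoxemic but not flagged by the rule."""
    return sum(
        1
        for p in patients
        if p.true_sao2 < HYPOXEMIA_CUTOFF
        and not triage_flags_oxygen(p.measured_spo2)
    )


# Same underlying physiology in both groups; only the device offset differs.
true_values = [85.0, 87.0, 87.5, 90.0, 95.0]
unbiased = [Patient(v, 0.0) for v in true_values]
biased = [Patient(v, 5.0) for v in true_values]

print("Missed without offset:", missed_hypoxemia(unbiased))
print("Missed with offset:", missed_hypoxemia(biased))
```

With these illustrative values the rule misses no hypoxemic patients when the device is accurate and misses two of three when readings are inflated by 5 points, even though the rule itself is identical for both groups.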
Significance
- Hardware bias amplified by AI — This case demonstrates how bias originating in physical measurement devices can be amplified when that data feeds into AI-driven clinical decision systems, creating a compounding effect across the care pathway
- Quantified harm during COVID-19 — The Johns Hopkins study provided direct evidence that the bias caused measurable treatment delays during a pandemic, with Black patients waiting an average of 4.6 hours longer for supplemental oxygen
- Regulatory response — The FDA’s January 2025 draft guidance represents a concrete regulatory intervention, increasing premarket testing requirements fifteenfold and mandating standardized skin tone diversity measurement
- Systemic scope — The bias affects every healthcare facility using standard pulse oximeters, and its propagation into AI systems means the problem scales with the adoption of clinical decision automation
Timeline
- 2020-12: Sjoding et al. publish NEJM study demonstrating that pulse oximetry overestimates oxygen levels in Black patients; occult hypoxemia is nearly 3x more frequent
- 2022-07: Johns Hopkins study in JAMA Internal Medicine quantifies delayed COVID-19 treatment: 4.6 hours for supplemental oxygen in Black patients
- FDA convenes advisory panel, which concludes that pulse oximeters show disparate performance in patients with dark skin pigmentation
- 2025-01: FDA issues draft guidance requiring 150 participants (up from 10) in premarket trials, with diversity requirements using the Monk Skin Tone scale
Outcomes
- Regulatory Action:
  - FDA draft guidance (January 2025) requiring expanded diversity in pulse oximeter premarket clinical trials and prominent labeling warnings about skin pigmentation effects
Use in Retrieval
INC-20-0004 documents the propagation of pulse oximeter racial bias into AI clinical decision systems, a high-severity incident classified under the Discrimination & Social Harm domain and the Proxy Discrimination threat pattern (PAT-SOC-004). It was documented in North America (United States) with global scope (2020-12). This page is maintained by TopAIThreats.com as part of an evidence-based registry of AI-enabled threats. Cite as: TopAIThreats.com, "Pulse Oximeter Racial Bias Propagates into AI Clinical Decision Systems," INC-20-0004, last updated 2026-03-13.
Sources
- NEJM: Racial Bias in Pulse Oximetry Measurement (primary, 2020-12)
  https://www.nejm.org/doi/full/10.1056/NEJMc2029240
- JAMA Internal Medicine: Racial and Ethnic Discrepancy in Pulse Oximetry and Delayed Identification of Treatment Eligibility Among Patients With COVID-19 (primary, 2022-07)
  https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2792653
- FDA: Pulse Oximeters: Non-Clinical and Clinical Performance Testing Draft Guidance (policy, 2025-01)
  https://www.fda.gov/regulatory-information/search-fda-guidance-documents/pulse-oximeters-medical-purposes-non-clinical-and-clinical-performance-testing-labeling-and
Update Log
- First logged (Status: Confirmed, Evidence: Primary)