Technical Attack

Voice Cloning

AI technology that replicates a specific individual's voice to generate realistic synthetic speech.

Definition

Voice cloning is the use of AI to replicate a specific individual’s voice from audio samples, enabling the generation of speech that closely mimics the original speaker’s tone, cadence, and pronunciation. Modern voice cloning systems can produce convincing replicas from only a few seconds of sample audio. The technology is used legitimately in accessibility and entertainment, but is increasingly exploited for fraud, impersonation, and social engineering.

How It Relates to AI Threats

Voice cloning is a primary enabler of deepfake-based impersonation within the Information Integrity domain. It powers social engineering attacks classified under Security & Cyber, including CEO fraud, grandparent scams, and vishing campaigns where attackers impersonate trusted contacts by phone. The combination of voice cloning with real-time generation makes telephone-based fraud particularly difficult to detect.

Why It Occurs

  • Commercial and open-source voice synthesis tools have become widely accessible
  • Modern models require minimal training data — sometimes only seconds of audio
  • Real-time voice conversion enables live impersonation during phone calls
  • Systems that authenticate a caller's voice are not widely deployed
  • Social norms around telephone communication create inherent trust in familiar voices

Real-World Context

Voice cloning has been documented in several incidents: the UK energy CEO fraud (INC-19-0001), the Hong Kong CFO video conference fraud (INC-24-0001), the Newfoundland grandparent scam (INC-23-0004), and the FBI deepfake impersonation warnings (INC-23-0001). The FBI Elder Fraud Report (INC-24-0004) specifically identifies AI-enhanced voice cloning as a growing vector in elder fraud schemes.

Last updated: 2026-02-14