People Are Poorly Equipped To Detect AI-powered Voice Clones
2024 · Sarah Barrington, Emily A. Cooper, Hany Farid
Abstract
As generative artificial intelligence (AI) continues its ballistic trajectory, everything from text to audio, image, and video generation keeps improving at mimicking human-generated content. Through a series of perceptual studies, we report on the realism of AI-generated voices in terms of identity matching and naturalness. We find that human participants cannot consistently identify recordings of AI-generated voices. Specifically, participants perceived the identity of an AI-generated voice to be the same as its real counterpart approximately 80% of the time, and correctly identified a voice as AI-generated only about 60% of the time.
Related papers
- Securing Voice-driven Interfaces Against Fake (cloned) Audio Attacks (2019)
- Defense Against Synthetic Speech: Real-time Detection Of RVC Voice Conversion Attacks (2025)
- Single And Multi-speaker Cloned Voice Detection: From Perceptual To Learned Features (2023)
- Can We Steal Your Vocal Identity From The Internet?: Initial Investigation Of Cloning Obama's Voice Using GAN, WaveNet And Low-quality Found Data (2018)
- Neural Voice Cloning With A Few Samples (2018)
- Detection Of AI-synthesized Speech Using Cepstral & Bispectral Statistics (2020)
- Voice Impersonation Using Generative Adversarial Networks (2018)
- Adversarial Attacks On Audio Deepfake Detection: A Benchmark And Comparative Study (2025)