A Survey On Speech Deepfake Detection
2024 · Menglu Li, Yasaman Ahmadiadli, Xiao-Ping Zhang
Abstract
The availability of smart devices leads to an exponential increase in multimedia content. However, advancements in deep learning have also enabled the creation of highly sophisticated Deepfake content, including speech Deepfakes, which pose a serious threat by generating realistic voices and spreading misinformation. To combat this, numerous challenges have been organized to advance speech Deepfake detection techniques. In this survey, we systematically analyze more than 200 papers published up to March 2024. We provide a comprehensive review of each component in the detection pipeline, including model architectures, optimization techniques, generalizability, evaluation metrics, performance comparisons, available datasets, and open source availability. For each aspect, we assess recent progress and discuss ongoing challenges. In addition, we explore emerging topics such as partial Deepfake detection, cross-dataset evaluation, and defences against adversarial attacks, while suggesting p
Related papers
- Deepfake: Definitions, Performance Metrics And Standards, Datasets And Benchmarks, And A Meta-review (2022) 11.85
- Diffuse Or Confuse: A Diffusion Deepfake Speech Dataset (2024) 5.24
- Adversarial Attacks On Audio Deepfake Detection: A Benchmark And Comparative Study (2025) 0.00
- Combining Automatic Speaker Verification And Prosody Analysis For Synthetic Speech Detection (2022) 10.48
- AUDETER: A Large-scale Dataset For Deepfake Audio Detection In Open Worlds (2025) 0.00
- Anomaly Detection And Localization For Speech Deepfakes Via Feature Pyramid Matching (2025) 4.52
- Vulnerability Of Automatic Identity Recognition To Audio-visual Deepfakes (2023) 6.77
- ASVspoof 2021: Towards Spoofed And Deepfake Speech Detection In The Wild (2022) 17.95