AUDETER: A Large-scale Dataset For Deepfake Audio Detection In Open Worlds
2025 · Qizhou Wang, Hanxun Huang, Guansong Pang, et al.
Abstract
Speech synthesis systems can now produce highly realistic vocalisations that pose significant authenticity challenges. Despite substantial progress in deepfake detection models, their real-world effectiveness is often undermined by evolving distribution shifts between training and test data, driven by the complexity of human speech and the rapid evolution of synthesis systems. Existing datasets suffer from limited real speech diversity, insufficient coverage of recent synthesis systems, and heterogeneous mixtures of deepfake sources, which hinder systematic evaluation and open-world model training. To address these issues, we introduce AUDETER (AUdio DEepfake TEst Range), a large-scale and highly diverse deepfake audio dataset comprising over 4,500 hours of synthetic audio generated by 11 recent TTS models and 10 vocoders, totalling 3 million clips. We further observe that most existing detectors default to binary supervised training, which can induce negative transfer across synthesis
Related papers
- MLAAD: The Multi-language Audio Anti-spoofing Dataset (2024)
- Towards Robust Audio Deepfake Detection: A Evolving Benchmark For Continual Learning (2024)
- Adversarial Attacks On Audio Deepfake Detection: A Benchmark And Comparative Study (2025)
- Xmad-bench: Cross-domain Multilingual Audio Deepfake Benchmark (2025)
- Deepfake Audio As A Data Augmentation Technique For Training Automatic Speech To Text Transcription Models (2023)
- The Codecfake Dataset And Countermeasures For The Universally Detection Of Deepfake Audio (2024)
- FADEL: Uncertainty-aware Fake Audio Detection With Evidential Deep Learning (2025)
- Vulnerability Of Automatic Identity Recognition To Audio-visual Deepfakes (2023)