The Sound Of Silence: Efficiency Of First Digit Features In Synthetic Audio Detection
2022 · Daniele Mari, Federica Latora, Simone Milani
Abstract
The recent integration of generative neural strategies and audio processing techniques has fostered the spread of speech synthesis and voice transformation algorithms. This capability is harmful in many legal and informational contexts (news, biometric authentication, audio evidence in court, etc.). Developing efficient detection algorithms is therefore crucial, and it is challenging because forgery techniques are heterogeneous. This work investigates the discriminative role of silent segments in synthetic speech detection and shows how first-digit statistics extracted from MFCC coefficients enable efficient and robust detection. The proposed procedure is computationally lightweight and effective on many different synthesis algorithms because it does not rely on a large neural detection architecture, and it achieves an accuracy above 90% on most classes of the ASVspoof dataset.
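To make "first-digit statistics" concrete, here is a minimal sketch of how such a feature could be computed from a set of coefficients. It is not the authors' implementation: the abstract does not specify the exact features, the silence segmentation, or the classifier. The function names, the flat list of coefficients as input, and the comparison against the Benford distribution are all assumptions made for illustration.

```python
import math
from collections import Counter
from decimal import Decimal


def first_digit(x):
    """Return the leading non-zero decimal digit of |x|, or None for 0/NaN/inf."""
    x = float(x)
    if x == 0.0 or not math.isfinite(x):
        return None
    # Decimal(repr(x)) avoids floating-point division artifacts
    # (e.g. 0.3 / 0.1 evaluating to 2.9999999999999996).
    return Decimal(repr(abs(x))).as_tuple().digits[0]


def first_digit_histogram(values):
    """Relative frequency of leading digits 1..9 over the given values."""
    counts = Counter(d for d in map(first_digit, values) if d is not None)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 9
    return [counts.get(d, 0) / total for d in range(1, 10)]


def benford_pmf():
    """Benford's law: P(d) = log10(1 + 1/d) for d = 1..9 (assumed reference)."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


# Hypothetical input: MFCC values that would come from silent frames of a clip.
coefficients = [1.0, 15.0, 0.2, -30.0]
histogram = first_digit_histogram(coefficients)
```

The resulting 9-bin histogram, or its distance from a reference distribution such as `benford_pmf()`, could then be passed to a small classifier. This would fit the abstract's claim that no large neural architecture is required.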
Related papers
- Deep Residual Neural Networks For Audio Spoofing Detection (2019) 0.00
- Detection Of AI-Synthesized Speech Using Cepstral & Bispectral Statistics (2020) 0.00
- Combining Automatic Speaker Verification And Prosody Analysis For Synthetic Speech Detection (2022) 10.48
- Lightweight Model Attribution And Detection Of Synthetic Speech Via Audio Residual Fingerprints (2024) 0.00
- Securing Voice Biometrics: One-Shot Learning Approach For Audio Deepfake Detection (2023) 9.03
- Evince The Artifacts Of Spoof Speech By Blending Vocal Tract And Voice Source Features (2022) 0.00
- Detecting The Undetectable: Assessing The Efficacy Of Current Spoof Detection Methods Against Seamless Speech Edits (2025) 0.00
- Syn-Att: Synthetic Speech Attribution Via Semi-Supervised Unknown Multi-Class Ensemble Of CNNs (2023) 0.00