Exploring WavLM Back-ends For Speech Spoofing And Deepfake Detection
2024 · Theophile Stourbe, Victor Miara, Theo Lepage, et al.
Abstract
This paper describes the systems we submitted to the ASVspoof 5 Challenge Track 1: Speech Deepfake Detection - Open Condition, which consists of a stand-alone speech deepfake (bona fide vs. spoof) detection task. Large-scale self-supervised models have recently become standard in Automatic Speech Recognition (ASR) and other speech processing tasks. We therefore use a pre-trained WavLM as a front-end model and pool its representations with different back-end techniques. The complete framework is fine-tuned using only the training dataset of the challenge, as in the closed condition. We also apply data augmentation, adding noise and reverberation from the MUSAN noise and RIR datasets, and experiment with codec augmentations to improve the performance of our method. Finally, we use the Bosaris toolkit for score calibration and system fusion to improve Cllr. Our fused system achieves 0.0937 minDCF, 3.42% EER, 0.1927 Cllr, and 0.1375 actDCF.
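To make the reported EER concrete, the following is a minimal sketch of how an equal error rate can be computed from detection scores. It is not the challenge's official scoring code or the Bosaris implementation. The function name `compute_eer`, the convention that higher scores mean "more likely bona fide", and the toy scores are all assumptions made for illustration.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold) for a hypothetical detector.

    Assumption: a higher score means the trial is more likely bona fide.
    The EER is the error rate at the threshold where the false acceptance
    rate and the false rejection rate are closest to each other.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection: bona fide trials scoring below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed trials scoring at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (illustrative only, not results from the paper).
bonafide = [0.9, 0.8, 0.7, 0.35]
spoof = [0.1, 0.2, 0.3, 0.6]
eer, thr = compute_eer(bonafide, spoof)
print(f"EER = {eer:.2%} at threshold {thr}")  # EER = 25.00% at threshold 0.6
```

In this toy example one bona fide trial and one spoofed trial are misclassified at the threshold 0.6, so both error rates equal 1/4. Unlike minDCF, actDCF, and Cllr, the EER depends only on the ranking of scores, which is why calibration affects those metrics but not the EER.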