XWSB: A Blend System Utilizing XLS-R And WavLM With SLS Classifier Detection System For SVDD 2024 Challenge
2024 · Qishan Zhang, Shuangbing Wen, Fangke Yan, et al.
Abstract
This paper introduces the model structure used in the SVDD 2024 Challenge, which was held for the first time this year. Singing voice deepfake detection (SVDD) faces complexities due to informal speech intonations and varying speech rates. In this paper, we propose the XWSB system, which achieved SOTA performance in the SVDD challenge. XWSB stands for XLS-R, WavLM, and SLS Blend, representing the integration of these technologies for the purpose of SVDD. Specifically, we used XLS-R&SLS, the best-performing model structure on the ASVspoof DF dataset, and applied SLS to WavLM to form the WavLM&SLS structure. Finally, we integrated the two models to form the XWSB system. Experimental results show that our system performs strongly in the SVDD challenge, achieving an EER of 2.32% in the CtrSVDD track. The code and data can be found at https://github.com/QiShanZhang/XWSB_for_SVDD2024.
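The abstract does not say how the two models are blended, so the following is only a minimal sketch of one common approach: average the per-utterance scores of the two systems and measure the equal error rate (EER) of the result. The weighted-average rule, the function names, the label convention (1 = bona fide, 0 = deepfake) and the toy scores are all assumptions for illustration. None of it is taken from the XWSB code or results.

```python
def blend_scores(scores_a, scores_b, weight=0.5):
    """Weighted average of two systems' scores (assumed fusion rule)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    return [weight * a + (1.0 - weight) * b for a, b in zip(scores_a, scores_b)]


def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping thresholds over the observed scores.

    labels: 1 = bona fide, 0 = deepfake (assumed convention).
    A higher score means "more likely bona fide".
    """
    n_bona = sum(labels)
    n_fake = len(labels) - n_bona
    if n_bona == 0 or n_fake == 0:
        raise ValueError("both classes must be present")

    best_gap, eer = None, None
    for threshold in sorted(set(scores)) + [float("inf")]:
        # An utterance is accepted as bona fide when score >= threshold.
        false_accepts = sum(
            1 for s, y in zip(scores, labels) if y == 0 and s >= threshold
        )
        false_rejects = sum(
            1 for s, y in zip(scores, labels) if y == 1 and s < threshold
        )
        far = false_accepts / n_fake
        frr = false_rejects / n_bona
        gap = abs(far - frr)
        # Keep the operating point where FAR and FRR are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Made-up toy scores, NOT results from the paper.
labels = [1, 1, 0, 0]
xlsr_sls_scores = [0.9, 0.4, 0.6, 0.1]   # hypothetical XLS-R&SLS outputs
wavlm_sls_scores = [0.5, 0.9, 0.2, 0.6]  # hypothetical WavLM&SLS outputs

blended = blend_scores(xlsr_sls_scores, wavlm_sls_scores)
print("XLS-R&SLS EER:", equal_error_rate(xlsr_sls_scores, labels))
print("WavLM&SLS EER:", equal_error_rate(wavlm_sls_scores, labels))
print("Blend EER:    ", equal_error_rate(blended, labels))
```

In this toy case each system misranks a different utterance, so averaging separates the classes better than either system alone. That is the usual motivation for blending models with complementary errors, though real gains depend on the data.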
Related papers
- Exploring WavLM Back-ends For Speech Spoofing And Deepfake Detection (2024)
- The SVASR System For Text-dependent Speaker Verification (TdSV) AAIC Challenge 2024 (2024)
- Speech Foundation Model Ensembles For The Controlled Singing Voice Deepfake Detection (CtrSVDD) Challenge 2024 (2024)
- SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan (2024)
- SVSNet+: Enhancing Speaker Voice Similarity Assessment Models With Representations From Speech Foundation Models (2024)
- VITS-based Singing Voice Conversion System With DSPGAN Post-processing For SVCC2023 (2023)
- ASASVIcomtech: The Vicomtech-UGR Speech Deepfake Detection And SASV Systems For The ASVspoof5 Challenge (2024)
- The Vicomtech Audio Deepfake Detection System Based On Wav2vec2 For The 2022 ADD Challenge (2022)