Experimental Study: Enhancing Voice Spoofing Detection Models With Wav2vec 2.0
2024 Β· Taein Kang, Soyul Han, Sunmook Choi, et al.
Abstract
Conventional spoofing detection systems have heavily relied on the use of handcrafted features derived from speech data. However, a notable shift has recently emerged towards the direct utilization of raw speech waveforms, as demonstrated by methods like SincNet filters. This shift underscores the demand for more sophisticated audio sample features. Moreover, the success of deep learning models, particularly those utilizing large pretrained wav2vec 2.0 as a featurization front-end, highlights the importance of refined feature encoders. In response, this research assessed the representational capability of wav2vec 2.0 as an audio feature extractor, modifying the size of its pretrained Transformer layers through two key adjustments: (1) selecting a subset of layers starting from the leftmost one and (2) fine-tuning a portion of the selected layers from the rightmost one. We complemented this analysis with five spoofing detection back-end models, with a primary focus on AASIST, enabling u
Authors
(none)
Tags
Stats
Related papers
- Representation Selective Self-distillation And Wav2vec 2.0 Feature Exploration For Spoof-aware Speaker Verification (2022)9.03
- Automatic Speaker Verification Spoofing And Deepfake Detection Using Wav2vec 2.0 And Data Augmentation (2022)17.35
- Deep Residual Neural Networks For Audio Spoofing Detection (2019)0.00
- Exploring Wavlm Back-ends For Speech Spoofing And Deepfake Detection (2024)4.52
- Toward Improving Synthetic Audio Spoofing Detection Robustness Via Meta-learning And Disentangled Training With Adversarial Examples (2024)6.77
- Mixture Of Experts Fusion For Fake Audio Detection Using Frozen Wav2vec 2.0 (2024)0.00
- A Comparative Study On Recent Neural Spoofing Countermeasures For Synthetic Speech Detection (2021)0.00
- Detection Of Doctored Speech: Towards An End-to-end Parametric Learn-able Filter Approach (2022)0.00