Improving Short Utterance Anti-spoofing With AASIST2
2023 · Yuxiang Zhang, Jingze Lu, Zengqiang Shang, et al.
Abstract
The wav2vec 2.0 and integrated spectro-temporal graph attention network (AASIST) based countermeasure achieves great performance in speech anti-spoofing. However, current spoof speech detection systems have fixed training and evaluation durations, while the performance degrades significantly during short utterance evaluation. To solve this problem, AASIST can be improved to AASIST2 by modifying the residual blocks to Res2Net blocks. The modified Res2Net blocks can extract multi-scale features and improve the detection performance for speech of different durations, thus improving the short utterance evaluation performance. On the other hand, adaptive large margin fine-tuning (ALMFT) has achieved performance improvement in short utterance speaker verification. Therefore, we apply Dynamic Chunk Size (DCS) and ALMFT training strategies in speech anti-spoofing to further improve the performance of short utterance evaluation. Experiments demonstrate that the proposed AASIST2 improves the per
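The abstract names Dynamic Chunk Size (DCS) training but does not describe how chunks are drawn. As a minimal sketch, assuming DCS means sampling one utterance length per mini-batch and cropping (or tiling) every waveform to that length, the idea could look like the following. The function names and the 16000 to 64000 sample range (1 to 4 s at an assumed 16 kHz rate) are hypothetical and are not taken from the paper.

```python
import random


def sample_chunk_size(min_samples, max_samples, rng):
    """Draw one chunk length (in samples) shared by the whole mini-batch."""
    return rng.randint(min_samples, max_samples)


def crop_or_pad(waveform, chunk_size, rng):
    """Return exactly chunk_size samples from a list of samples.

    Long inputs are randomly cropped; short inputs are tiled (repeated)
    and then truncated. Tiling is an assumed padding choice.
    """
    n = len(waveform)
    if n == 0:
        raise ValueError("empty waveform")
    if n >= chunk_size:
        start = rng.randint(0, n - chunk_size)
        return waveform[start:start + chunk_size]
    repeats = -(-chunk_size // n)  # ceiling division
    return (waveform * repeats)[:chunk_size]


def make_batch(waveforms, min_samples, max_samples, rng):
    """Build one mini-batch whose items all share a freshly sampled length."""
    chunk_size = sample_chunk_size(min_samples, max_samples, rng)
    return [crop_or_pad(w, chunk_size, rng) for w in waveforms]


if __name__ == "__main__":
    rng = random.Random(0)
    # Two dummy utterances of different lengths (hypothetical data).
    batch = make_batch([[0.1] * 50000, [0.2] * 8000], 16000, 64000, rng)
    print([len(x) for x in batch])
```

Because the length is resampled for every batch, a model trained this way sees both short and long inputs, which is the property the abstract ties to better short utterance evaluation.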
Related papers
- Experimental Study: Enhancing Voice Spoofing Detection Models With Wav2vec 2.0 (2024)
- AASIST3: KAN-enhanced AASIST Speech Deepfake Detection Using SSL Features And Additional Regularization For The ASVspoof 2024 Challenge (2024)
- Automatic Speaker Verification Spoofing And Deepfake Detection Using Wav2vec 2.0 And Data Augmentation (2022)
- Attentive Activation Function For Improving End-to-end Spoofing Countermeasure Systems (2022)
- Representation Selective Self-distillation And Wav2vec 2.0 Feature Exploration For Spoof-aware Speaker Verification (2022)
- Toward Improving Synthetic Audio Spoofing Detection Robustness Via Meta-learning And Disentangled Training With Adversarial Examples (2024)
- Anti-spoofing Methods For Automatic Speaker Verification System (2017)
- A Comparative Study On Recent Neural Spoofing Countermeasures For Synthetic Speech Detection (2021)