AASIST3: KAN-enhanced AASIST Speech Deepfake Detection Using SSL Features And Additional Regularization For The ASVspoof 2024 Challenge
2024 · Kirill Borodin, Vasiliy Kudryavtsev, Dmitrii Korzh, et al.
Abstract
Automatic Speaker Verification (ASV) systems, which identify speakers based on their voice characteristics, have numerous applications, such as user authentication in financial transactions, exclusive access control in smart devices, and forensic fraud detection. However, the advancement of deep learning algorithms has enabled the generation of synthetic audio through Text-to-Speech (TTS) and Voice Conversion (VC) systems, exposing ASV systems to potential vulnerabilities. To counteract this, we propose a novel architecture named AASIST3. By enhancing the existing AASIST framework with Kolmogorov-Arnold networks, additional layers, encoders, and pre-emphasis techniques, AASIST3 achieves a more than twofold improvement in performance. It demonstrates minDCF results of 0.5357 in the closed condition and 0.1414 in the open condition, significantly enhancing the detection of synthetic voices and improving ASV security.
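The abstract lists pre-emphasis among the enhancements but does not say which filter is used. As a rough illustration only, the sketch below applies the common first-order pre-emphasis filter y[n] = x[n] - alpha * x[n-1]. The function name `pre_emphasis` and the coefficient `alpha=0.97` are assumptions chosen for illustration, not details taken from the paper.

```python
def pre_emphasis(signal, alpha=0.97):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1].

    `alpha` is an assumed, commonly used value; the paper's setting is not
    given in the abstract. The first sample is passed through unchanged.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


# A constant (low-frequency) input is strongly attenuated after the first sample.
flat = pre_emphasis([1.0, 1.0, 1.0, 1.0])

# An alternating (high-frequency) input is boosted in magnitude.
alt = pre_emphasis([1.0, -1.0, 1.0, -1.0])
```

The filter boosts high-frequency content relative to low-frequency content, which is the usual motivation for applying it before feature extraction.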
Related papers
- ASASVIcomtech: The Vicomtech-UGR Speech Deepfake Detection And SASV Systems For The ASVspoof5 Challenge (2024)
- Improving Short Utterance Anti-spoofing With AASIST2 (2023)
- Automatic Speaker Verification Spoofing And Deepfake Detection Using wav2vec 2.0 And Data Augmentation (2022)
- Securing Voice Biometrics: One-shot Learning Approach For Audio Deepfake Detection (2023)
- Temporal Variability And Multi-viewed Self-supervised Representations To Tackle The ASVspoof5 Deepfake Challenge (2024)
- Application Of ASV For Voice Identification After VC And Duration Predictor Improvement In TTS Models (2024)
- ASVspoof 2021: Towards Spoofed And Deepfake Speech Detection In The Wild (2022)
- Joint Optimization Of Speaker And Spoof Detectors For Spoofing-robust Automatic Speaker Verification (2025)