Adaptive Re-calibration Of Channel-wise Features For Adversarial Audio Classification
2022 · Vardhan Dongre, Abhinav Thimma Reddy, Nikhitha Reddeddy
Abstract
DeepFake audio, unlike DeepFake images and videos, has been relatively underexplored from a detection perspective, and the existing solutions for synthetic speech classification either use complex networks or don't generalize to the varieties of synthetic speech produced by different generative and optimization-based methods. In this work, we propose a channel-wise recalibration of features using attention feature fusion for synthetic speech detection and compare its performance against other detection methods, including End2End models and ResNet-based models, on synthetic speech generated using Text-to-Speech and vocoder systems such as WaveNet, WaveRNN, Tacotron, and WaveGlow. We also experiment with Squeeze-and-Excitation (SE) blocks in our ResNet models and find that the combination achieves better performance. In addition to the analysis, we also demonstrate that the combination of Linear frequency cepstral coefficients (LFCC) and Mel Frequency cepstral coeff
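The abstract does not spell out the architecture, so the following is only a minimal sketch of the generic Squeeze-and-Excitation idea it refers to: pool each channel to a single value, pass the result through a small bottleneck, and rescale each channel by the resulting gate. The function name `squeeze_excite`, the weight shapes, and the reduction ratio are illustrative assumptions, not the authors' implementation, and the attention feature fusion step is not modeled here.

```python
import math
import random


def squeeze_excite(feature_map, w1, b1, w2, b2):
    """Generic SE recalibration (hypothetical helper, not the paper's code).

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1, b1: bottleneck layer mapping C -> C // r (r is an assumed ratio).
    w2, b2: expansion layer mapping C // r -> C.
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, z)) + b)
        for row, b in zip(w1, b1)
    ]
    # Excitation, step 2: fully connected layer followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
        for row, b in zip(w2, b2)
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


if __name__ == "__main__":
    random.seed(0)
    channels, height, width, reduced = 4, 3, 5, 2  # assumed toy sizes
    x = [[[random.random() for _ in range(width)] for _ in range(height)]
         for _ in range(channels)]
    w1 = [[random.uniform(-1, 1) for _ in range(channels)] for _ in range(reduced)]
    w2 = [[random.uniform(-1, 1) for _ in range(reduced)] for _ in range(channels)]
    y, gates = squeeze_excite(x, w1, [0.0] * reduced, w2, [0.0] * channels)
    print("per-channel gates:", [round(g, 3) for g in gates])
```

In a real model the two weight matrices would be learned jointly with the ResNet, and the input would be a spectral feature map such as LFCC or MFCC features rather than random numbers.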
Related papers
- Self-attention And Hybrid Features For Replay And Deep-fake Audio Detection (2024) · 0.00
- Deep Residual Neural Networks For Audio Spoofing Detection (2019) · 0.00
- What To Remember: Self-adaptive Continual Learning For Audio Deepfake Detection (2023) · 10.48
- MFAAN: Unveiling Audio Deepfakes With A Multi-feature Authenticity Network (2023) · 7.81
- Combining Automatic Speaker Verification And Prosody Analysis For Synthetic Speech Detection (2022) · 10.48
- Securing Voice Biometrics: One-shot Learning Approach For Audio Deepfake Detection (2023) · 9.03
- The Vicomtech Audio Deepfake Detection System Based On Wav2vec2 For The 2022 ADD Challenge (2022) · 14.06
- Adversarial Attacks On Audio Deepfake Detection: A Benchmark And Comparative Study (2025) · 0.00