Light Convolutional Neural Network With Feature Genuinization For Detection Of Synthetic Speech Attacks
2020 · Zhenzong Wu, Rohan Kumar Das, Jichen Yang, et al.
Abstract
Modern text-to-speech (TTS) and voice conversion (VC) systems produce natural-sounding speech that calls into question the security of automatic speaker verification (ASV). Detecting such synthetic speech is therefore important for safeguarding ASV systems against unauthorized access. Most existing spoofing countermeasures perform well when the nature of the attacks is known to the system during training, but their performance degrades in the face of unseen attack types. Compared with the synthetic speech created by a wide range of TTS and VC methods, genuine speech has a more consistent distribution. We believe that the difference between the distributions of synthetic and genuine speech is an important discriminative feature between the two classes. In this regard, we propose a novel method, referred to as feature genuinization, that learns a transformer with a convolutional neural network (CNN) using the characteristics of only genuine speech. We then use this genuinization tra
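The one-class idea in the abstract, fitting a transformation on genuine data only so that genuine inputs pass through nearly unchanged, can be illustrated with a toy sketch. This is not the paper's method: the CNN transformer is replaced here by a linear PCA projection, the features are random toy vectors rather than speech features, and the function names (`fit_genuine_transform`, `genuinize`, `distortion`) are hypothetical. The abstract is cut off before it says how the transformer is used downstream, so measuring the input-to-output change is an assumption made purely for illustration.

```python
import numpy as np

def fit_genuine_transform(genuine, n_components):
    """Fit a linear projection from genuine features only (PCA stand-in for the CNN)."""
    mean = genuine.mean(axis=0)
    # Rows of vt are the principal directions of the centered genuine data.
    _, _, vt = np.linalg.svd(genuine - mean, full_matrices=False)
    return mean, vt[:n_components]

def genuinize(features, mean, basis):
    """Project features onto the genuine subspace and map them back."""
    return mean + (features - mean) @ basis.T @ basis

def distortion(features, mean, basis):
    """Per-sample squared difference between the input and its transformed version."""
    diff = features - genuinize(features, mean, basis)
    return (diff ** 2).sum(axis=1)

rng = np.random.default_rng(0)
dim, rank = 20, 3
mixing = rng.normal(size=(rank, dim))

# Toy "genuine" features: concentrated near a 3-dimensional subspace (assumption).
genuine_train = rng.normal(size=(500, rank)) @ mixing + 0.05 * rng.normal(size=(500, dim))
genuine_test = rng.normal(size=(100, rank)) @ mixing + 0.05 * rng.normal(size=(100, dim))
# Toy "synthetic" features: spread over all dimensions (assumption).
synthetic_test = rng.normal(size=(100, dim))

# The transform is fitted without seeing any synthetic data.
mean, basis = fit_genuine_transform(genuine_train, rank)
genuine_change = distortion(genuine_test, mean, basis).mean()
synthetic_change = distortion(synthetic_test, mean, basis).mean()

print("mean change, genuine:  ", genuine_change)
print("mean change, synthetic:", synthetic_change)
```

In this toy setup the transformation alters held-out genuine features far less than the synthetic ones, even though no synthetic data was used to fit it, which is the property that makes a genuine-only model attractive against unseen attack types.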
Related papers
- One-class Learning Towards Synthetic Voice Spoofing Detection (2020) 17.31
- Deep Residual Neural Networks For Audio Spoofing Detection (2019) 0.00
- Defense Against Synthetic Speech: Real-time Detection Of RVC Voice Conversion Attacks (2025) 0.00
- Securing Voice Biometrics: One-shot Learning Approach For Audio Deepfake Detection (2023) 9.03
- Combining Automatic Speaker Verification And Prosody Analysis For Synthetic Speech Detection (2022) 10.48
- Syn-att: Synthetic Speech Attribution Via Semi-supervised Unknown Multi-class Ensemble Of Cnns (2023) 0.00
- Transforming Acoustic Characteristics To Deceive Playback Spoofing Countermeasures Of Speaker Verification Systems (2018) 6.34
- Using Deep Learning Techniques And Inferential Speech Statistics For AI Synthesised Speech Recognition (2021) 0.00