Syn-att: Synthetic Speech Attribution Via Semi-supervised Unknown Multi-class Ensemble Of CNNs
2023 · Md Awsafur Rahman, Bishmoy Paul, Najibul Haque Sarker, et al.
Abstract
With the major advances that deep learning has brought to audio and speech processing, many novel synthetic speech techniques have achieved incredibly realistic results. Because these methods generate realistic fake human voices, they can be used in malicious acts such as impersonation, the spreading of fake news, spoofing, and media manipulation. Hence, the ability to distinguish synthetic from natural speech has become an urgent necessity. Moreover, being able to tell which algorithm was used to generate a synthetic speech track can be of great importance in tracking down the culprit. In this paper, a novel strategy is proposed to attribute a synthetic speech track to the generator used to synthesize it. The proposed detector transforms the audio into a log-mel spectrogram, extracts features using a CNN, and classifies the track as one of five known algorithms or as an unknown algorithm, utilizing semi-supervision and ensembling to significantly improve its robustness and generalizability. The proposed detec
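The abstract names an ensemble of CNNs and semi-supervision but gives no implementation details, so the following is only a minimal sketch of one common way such pieces fit together: soft voting over per-model class probabilities, followed by confidence-based pseudo-label selection for unlabeled clips. The class names, the 0.9 threshold, the pseudo-labeling scheme, and both function names are assumptions made for illustration, not the authors' method. The logits stand in for the outputs of CNNs applied to log-mel spectrograms.

```python
import math

# Hypothetical label set: five known generators plus one "unknown" class.
CLASSES = ["alg_0", "alg_1", "alg_2", "alg_3", "alg_4", "unknown"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits):
    """Soft voting: average each model's softmax output for one clip.

    per_model_logits holds one list of len(CLASSES) logits per model.
    Returns (label, confidence, averaged_probabilities).
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    mean = [sum(p[i] for p in probs) / n_models for i in range(len(CLASSES))]
    best = max(range(len(CLASSES)), key=mean.__getitem__)
    return CLASSES[best], mean[best], mean


def select_pseudo_labels(unlabeled_logits, threshold=0.9):
    """Assumed semi-supervised step: keep (clip_index, label) pairs for
    unlabeled clips whose ensemble confidence reaches the threshold."""
    selected = []
    for idx, per_model in enumerate(unlabeled_logits):
        label, confidence, _ = ensemble_predict(per_model)
        if confidence >= threshold:
            selected.append((idx, label))
    return selected
```

In this sketch, a clip on which the models agree strongly would be added to the training set with its predicted label, while a clip on which they disagree would be left out until a later round.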
Related papers
- Lightweight Model Attribution And Detection Of Synthetic Speech Via Audio Residual Fingerprints (2024) · 0.00
- Combining Automatic Speaker Verification And Prosody Analysis For Synthetic Speech Detection (2022) · 10.48
- Using Deep Learning Techniques And Inferential Speech Statistics For AI Synthesised Speech Recognition (2021) · 0.00
- Detection Of AI-synthesized Speech Using Cepstral & Bispectral Statistics (2020) · 0.00
- Speech-forensics: Towards Comprehensive Synthetic Speech Dataset Establishment And Analysis (2024) · 3.58
- One-class Learning Towards Synthetic Voice Spoofing Detection (2020) · 17.31
- Synthetic Speech Classification: IEEE Signal Processing Cup 2022 Challenge (2024) · 0.00
- Light Convolutional Neural Network With Feature Genuinization For Detection Of Synthetic Speech Attacks (2020) · 13.97