VSMask: Defending Against Voice Synthesis Attack Via Real-time Predictive Perturbation
2023 · Yuanda Wang, Hanqing Guo, Guangjing Wang, et al.
Abstract
Deep-learning-based voice synthesis technology generates artificial, human-like speech, which has been used in deepfake and identity-theft attacks. Existing defense mechanisms inject subtle adversarial perturbations into the raw speech audio to mislead voice synthesis models. However, optimizing the adversarial perturbation not only consumes substantial computation time but also requires the entire speech to be available in advance. Such defenses are therefore unsuitable for protecting live speech streams, such as voice messages or online meetings. In this paper, we propose VSMask, a real-time protection mechanism against voice synthesis attacks. Unlike offline protection schemes, VSMask leverages a predictive neural network to forecast the most effective perturbation for the upcoming streaming speech. VSMask also introduces a universal perturbation, tailored for arbitrary speech input, to shield real-time speech in its entirety. To minimize the audio distortion within the protected speech
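The streaming idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `predict_perturbation` is a hypothetical stand-in for the predictive neural network (here a single `tanh` layer with made-up weights), the chunking scheme and the bound `epsilon` are assumptions, and using the universal perturbation for the first chunk (before any past audio exists to predict from) is likewise an assumption made only for illustration.

```python
import numpy as np

def predict_perturbation(past_chunk, weights, epsilon):
    """Hypothetical stand-in for a predictive network: maps the chunk that
    was just heard to a bounded perturbation for the chunk that comes next."""
    raw = np.tanh(weights @ past_chunk)  # tanh keeps every value in [-1, 1]
    return epsilon * raw                 # so each sample satisfies |delta| <= epsilon

def protect_stream(chunks, weights, universal, epsilon=0.01):
    """Perturb each incoming chunk using only audio that has already arrived."""
    # Assumption: the first chunk has no history, so it gets the fixed
    # universal perturbation, clipped to the same epsilon bound.
    delta = np.clip(universal, -epsilon, epsilon)
    protected = []
    for chunk in chunks:
        # The perturbation was computed before this chunk arrived,
        # so applying it adds no optimization delay.
        protected.append(np.clip(chunk + delta, -1.0, 1.0))
        # Forecast the perturbation for the next chunk from the current one.
        delta = predict_perturbation(chunk, weights, epsilon)
    return protected

# Toy usage with random data in place of real audio and trained weights.
rng = np.random.default_rng(0)
chunks = [rng.uniform(-0.5, 0.5, 8) for _ in range(4)]
weights = rng.normal(size=(8, 8))
universal = np.full(8, 0.005)
protected = protect_stream(chunks, weights, universal, epsilon=0.01)
```

The point of the sketch is the ordering: each perturbation depends only on earlier chunks, which is what lets a scheme of this kind run on a live stream instead of requiring the whole utterance up front.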
Related papers
- SafeSpeech: Robust And Universal Voice Protection Against Malicious Speech Synthesis (2025) 0.00
- Adversarial Speech For Voice Privacy Protection From Personalized Speech Generation (2024) 8.09
- Defense Against Synthetic Speech: Real-time Detection Of RVC Voice Conversion Attacks (2025) 0.00
- One-class Learning Towards Synthetic Voice Spoofing Detection (2020) 17.31
- Securing Voice Biometrics: One-shot Learning Approach For Audio Deepfake Detection (2023) 9.03
- Toward Improving Synthetic Audio Spoofing Detection Robustness Via Meta-learning And Disentangled Training With Adversarial Examples (2024) 6.77
- Deep Residual Neural Networks For Audio Spoofing Detection (2019) 0.00
- Securing Voice-driven Interfaces Against Fake (Cloned) Audio Attacks (2019) 9.92