Towards Evaluating The Robustness Of Automatic Speech Recognition Systems Via Audio Style Transfer
2024 · Weifei Jin, Yuxin Cao, Junjie Su, et al.
Abstract
In light of the widespread application of Automatic Speech Recognition (ASR) systems, their security concerns have received much more attention than ever before, primarily due to the susceptibility of Deep Neural Networks. Previous studies have illustrated that surreptitiously crafting adversarial perturbations enables the manipulation of speech recognition systems, resulting in the production of malicious commands. These attack methods mostly require adding noise perturbations under \(\ell_p\) norm constraints, inevitably leaving behind artifacts of manual modifications. Recent research has alleviated this limitation by manipulating style vectors to synthesize adversarial examples based on Text-to-Speech (TTS) synthesis audio. However, style modifications based on optimization objectives significantly reduce the controllability and editability of audio styles. In this paper, we propose an attack on ASR systems based on user-customized style transfer. We first test the effect of Style
Related papers
- Adversarial Attacks Against Automatic Speech Recognition Systems Via Psychoacoustic Hiding (2018) · 16.45
- Improving Performance Of Seen And Unseen Speech Style Transfer In End-to-end Neural TTS (2021) · 6.34
- Time Domain Neural Audio Style Transfer (2017) · 0.00
- Towards Understanding And Mitigating Audio Adversarial Examples For Speaker Recognition (2022) · 11.67
- STYLER: Style Factor Modeling With Rapidity And Robustness Via Speech Decomposition For Expressive And Controllable Neural Text To Speech (2021) · 9.23
- Targeted Adversarial Examples For Black Box Audio Systems (2018) · 15.75
- Universal Adversarial Perturbations For Speech Recognition Systems (2019) · 14.11
- Audio Adversarial Examples For Robust Hybrid CTC/Attention Speech Recognition (2020) · 3.58