Pseudo-siamese Network Based Timbre-reserved Black-box Adversarial Attack In Speaker Identification
2023 · Qing Wang, Jixun Yao, Ziqian Wang, et al.
Abstract
In this study, we propose a timbre-reserved adversarial attack approach for speaker identification (SID) that not only exploits the weakness of the SID model but also preserves the timbre of the target speaker in a black-box attack setting. Specifically, we generate timbre-reserved fake audio by adding an adversarial constraint during the training of the voice conversion model. We then leverage a pseudo-Siamese network architecture to learn from the black-box SID model, constraining intrinsic similarity and structural similarity simultaneously. The intrinsic similarity loss learns an intrinsic invariance, while the structural similarity loss ensures that the substitute SID model shares a similar decision boundary with the fixed black-box SID model. The substitute model can then be used as a proxy to generate timbre-reserved fake audio for attacking. Experimental results on the Audio Deepfake Detection (ADD) challenge dataset indicate that the attack success rate of our proposed app
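To make the two-term objective for the substitute model more concrete, here is a minimal sketch in plain Python. The abstract does not give the exact loss definitions, so everything below is an assumption for illustration: the intrinsic similarity term is taken to be a cosine distance between the two models' posterior vectors, the structural similarity term is taken to be a KL divergence from the black-box posteriors to the substitute posteriors, and the names `substitute_loss`, `alpha`, and `beta` are hypothetical. It is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def intrinsic_similarity_loss(sub_probs, bb_probs):
    """Assumed form: 1 - cosine similarity between the two output vectors."""
    dot = sum(a * b for a, b in zip(sub_probs, bb_probs))
    norm_sub = math.sqrt(sum(a * a for a in sub_probs))
    norm_bb = math.sqrt(sum(b * b for b in bb_probs))
    return 1.0 - dot / (norm_sub * norm_bb + 1e-12)


def structural_similarity_loss(sub_probs, bb_probs):
    """Assumed form: KL(black-box || substitute) over speaker posteriors."""
    return sum(
        p * math.log(p / max(q, 1e-12))
        for p, q in zip(bb_probs, sub_probs)
        if p > 0.0
    )


def substitute_loss(sub_logits, bb_probs, alpha=1.0, beta=1.0):
    """Weighted sum of the two assumed terms (alpha, beta are hypothetical).

    sub_logits: raw scores from the trainable substitute SID model.
    bb_probs:   posteriors returned by querying the fixed black-box model.
    """
    sub_probs = softmax(sub_logits)
    return (
        alpha * intrinsic_similarity_loss(sub_probs, bb_probs)
        + beta * structural_similarity_loss(sub_probs, bb_probs)
    )


if __name__ == "__main__":
    # Toy three-speaker example: the black-box posteriors are simulated here.
    bb = softmax([2.0, 0.5, -1.0])
    print("matching substitute:  ", substitute_loss([2.0, 0.5, -1.0], bb))
    print("mismatched substitute:", substitute_loss([-1.0, 0.5, 2.0], bb))
```

In this sketch the loss is close to zero when the substitute reproduces the black-box posteriors and grows as the two disagree, which is the behaviour one would want from a proxy trained to mimic the black-box decision boundary. In practice the substitute would be a neural network trained by gradient descent on such an objective; the pure-Python version only illustrates how the two terms combine.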
Related papers
- Diffattack: Diffusion-based Timbre-reserved Adversarial Attack In Speaker Identification (2025) 0.00
- Symmetric Saliency-based Adversarial Attack To Speaker Identification (2022) 8.60
- Targeted Adversarial Examples For Black Box Audio Systems (2018) 15.75
- Inaudible Adversarial Perturbations For Targeted Attack In Speaker Recognition (2020) 12.33
- Impact Of Phonetics On Speaker Identity In Adversarial Voice Attack (2025) 0.00
- Adversarial Defense For Deep Speaker Recognition Using Hybrid Adversarial Training (2020) 9.59
- Foolhd: Fooling Speaker Identification By Highly Imperceptible Adversarial Disturbances (2020) 10.07
- Towards Understanding And Mitigating Audio Adversarial Examples For Speaker Recognition (2022) 11.67