A Joint Noise Disentanglement And Adversarial Training Framework For Robust Speaker Verification
2024 · Xujiang Xing, Mingxing Xu, Thomas Fang Zheng
Abstract
Automatic Speaker Verification (ASV) suffers from performance degradation in noisy conditions. To address this issue, we propose a novel adversarial learning framework that incorporates noise disentanglement to establish a noise-independent, speaker-invariant embedding space. Specifically, the disentanglement module includes two encoders that separate speaker-related and speaker-irrelevant information, respectively. The reconstruction module serves as a regularization term to constrain the noise. A feature-robust loss is also used to supervise the speaker encoder so that it learns noise-independent speaker embeddings without losing speaker information. In addition, adversarial training is introduced to discourage the speaker encoder from encoding acoustic condition information, yielding a speaker-invariant embedding space. Experiments on VoxCeleb1 indicate that the proposed method improves the performance of the speaker verification system under both clean and noisy conditions.
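The abstract names four training signals (speaker supervision, reconstruction, a feature-robust loss, and adversarial training) but gives no formulas. The following is only a rough sketch of how such terms might be combined, not the authors' implementation. Every concrete choice here is an assumption made for illustration: the single linear layers, the mean-squared-error form of the reconstruction and feature-robust terms, the sign convention for the adversarial term, the loss weights, and all names (`losses`, `encoder_objective`, `W_spk`, and so on).

```python
import math
import random

random.seed(0)

def init(rows, cols):
    """Random weight matrix stored as a list of rows (hypothetical toy layer)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def linear(weights, x):
    """Apply a weight matrix to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]

# Toy sizes, chosen arbitrarily for the sketch.
FEAT, EMB, N_SPK, N_COND = 8, 4, 3, 2

W_spk = init(EMB, FEAT)       # speaker encoder
W_noise = init(EMB, FEAT)     # speaker-irrelevant (noise) encoder
W_dec = init(FEAT, 2 * EMB)   # decoder used by the reconstruction module
W_cls = init(N_SPK, EMB)      # speaker classifier
W_disc = init(N_COND, EMB)    # acoustic-condition discriminator

def losses(clean, noisy, speaker, condition):
    """Compute each loss term for one clean/noisy pair of the same utterance."""
    e_clean = linear(W_spk, clean)
    e_noisy = linear(W_spk, noisy)
    n_noisy = linear(W_noise, noisy)
    # Reconstruct the noisy input from the concatenated embeddings
    # (list `+` is concatenation here, not element-wise addition).
    recon = mse(linear(W_dec, e_noisy + n_noisy), noisy)
    # Assumed form of the feature-robust loss: pull the noisy speaker
    # embedding toward the clean one.
    robust = mse(e_noisy, e_clean)
    spk = cross_entropy(linear(W_cls, e_noisy), speaker)
    disc = cross_entropy(linear(W_disc, e_noisy), condition)
    return {"speaker": spk, "reconstruction": recon,
            "feature_robust": robust, "discriminator": disc}

def encoder_objective(terms, w_rec=1.0, w_rob=1.0, w_adv=0.1):
    """Objective minimized by the encoders; the discriminator term enters with
    a minus sign so the speaker encoder is pushed to hide the condition."""
    return (terms["speaker"]
            + w_rec * terms["reconstruction"]
            + w_rob * terms["feature_robust"]
            - w_adv * terms["discriminator"])

clean = [random.gauss(0, 1) for _ in range(FEAT)]
noisy = [c + random.gauss(0, 0.3) for c in clean]
terms = losses(clean, noisy, speaker=1, condition=0)
print(terms)
print(encoder_objective(terms))
```

In an actual training loop the discriminator would be updated separately to minimize its own cross-entropy, alternating with (or connected through gradient reversal to) the encoder update. This sketch only evaluates the forward losses and performs no optimization.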
Related papers
- Disentangled Speaker And Nuisance Attribute Embedding For Robust Speaker Verification (2020)
- DEAAN: Disentangled Embedding And Adversarial Adaptation Network For Robust Speaker Representation Learning (2020)
- Noise-conditioned Mixture-of-experts Framework For Robust Speaker Verification (2025)
- Within-sample Variability-invariant Loss For Robust Speaker Recognition Under Noisy Environments (2020)
- SEEF-ALDR: A Speaker Embedding Enhancement Framework Via Adversarial Learning Based Disentangled Representation (2019)
- Adversarial Network Bottleneck Features For Noise Robust Speaker Verification (2017)
- Disentangled Representation Learning For Environment-agnostic Speaker Recognition (2024)
- Diffusion-based Adversarial Purification For Speaker Verification (2023)