Disentangled Speaker And Nuisance Attribute Embedding For Robust Speaker Verification
2020 · Woo Hyun Kang, Sung Hwan Mun, Min Hyun Han, et al.
Abstract
In recent years, various deep learning-based embedding methods have been proposed and have shown impressive performance in speaker verification. However, as with most classical embedding techniques, deep learning-based methods are known to suffer severe performance degradation when dealing with speech samples under different conditions (e.g., recording devices, emotional states). In this paper, we propose a novel fully supervised training method for extracting a speaker embedding vector disentangled from the variability caused by nuisance attributes. The proposed framework was compared with conventional deep learning-based embedding methods on the RSR2015 and VoxCeleb1 datasets. Experimental results show that the proposed approach can extract speaker embeddings that are robust to channel and emotional variability.
Related papers
- A Joint Noise Disentanglement And Adversarial Training Framework For Robust Speaker Verification (2024) 6.34
- DEAAN: Disentangled Embedding And Adversarial Adaptation Network For Robust Speaker Representation Learning (2020) 9.59
- Disentangled Representation Learning For Environment-agnostic Speaker Recognition (2024) 4.82
- Within-sample Variability-invariant Loss For Robust Speaker Recognition Under Noisy Environments (2020) 11.85
- Feature Enhancement With Deep Feature Losses For Speaker Verification (2019) 10.61
- Intra-class Variation Reduction Of Speaker Representation In Disentanglement Framework (2020) 8.35
- Deep Speaker Embeddings For Far-field Speaker Recognition On Short Utterances (2020) 11.29
- Leveraging Speaker Attribute Information Using Multi Task Learning For Speaker Verification And Diarization (2020) 6.34