Intra-class Variation Reduction Of Speaker Representation In Disentanglement Framework
2020 · Yoohwan Kwon, Soo-Whan Chung, Hong-Goo Kang
Abstract
In this paper, we propose an effective training strategy to extract robust speaker representations from a speech signal. One of the key challenges in speaker recognition tasks is to learn latent representations or embeddings containing solely speaker characteristic information in order to be robust in terms of intra-speaker variations. By modifying the network architecture to generate both speaker-related and speaker-unrelated representations, we exploit a learning criterion which minimizes the mutual information between these disentangled embeddings. We also introduce an identity change loss criterion which utilizes a reconstruction error to different utterances spoken by the same speaker. Since the proposed criteria reduce the variation of speaker characteristics caused by changes in background environment or spoken content, the resulting embeddings of each speaker become more consistent. The effectiveness of the proposed method is demonstrated through two tasks; disentanglement perform
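One possible reading of the identity change loss is sketched below: a decoder reconstructs utterance B from B's own speaker-unrelated embedding combined with the speaker embedding taken from a different utterance A of the same speaker, and the reconstruction error is penalized. This is an illustrative toy under stated assumptions, not the authors' implementation; `toy_decode`, `identity_change_loss`, the embedding sizes, and the use of mean squared error are all hypothetical choices that the abstract does not specify.

```python
def mse(x, y):
    """Mean squared error between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def toy_decode(spk_emb, res_emb):
    """Hypothetical decoder: maps a speaker embedding and a
    speaker-unrelated (residual) embedding to a feature vector."""
    return [2.0 * s for s in spk_emb] + list(res_emb)


def identity_change_loss(decode, spk_emb_a, res_emb_b, feats_b):
    """Reconstruction error for utterance B when its speaker embedding
    is swapped for the one extracted from utterance A (same speaker)."""
    recon_b = decode(spk_emb_a, res_emb_b)
    return mse(recon_b, feats_b)


# Utterance B of some speaker: its embeddings and the features they decode to.
spk_b = [0.5, -1.0]
res_b = [0.2, 0.1, 0.3]
feats_b = toy_decode(spk_b, res_b)

# Case 1: utterance A yields the same speaker embedding (no intra-speaker drift).
loss_consistent = identity_change_loss(toy_decode, [0.5, -1.0], res_b, feats_b)

# Case 2: utterance A yields a drifted speaker embedding (e.g. noise or content leak).
loss_drifted = identity_change_loss(toy_decode, [1.0, -1.0], res_b, feats_b)

print(loss_consistent, loss_drifted)
```

In this toy setup the loss is zero only when the speaker embedding is identical across the two utterances, which is the consistency property the abstract attributes to the criterion. The mutual information term is not sketched here, since the abstract does not say how it is estimated.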
Related papers
- Self-supervised Disentangled Representation Learning For Robust Target Speech Extraction (2023) 5.24
- Disentangled Representation Learning For Environment-agnostic Speaker Recognition (2024) 4.82
- Contentvec: An Improved Self-supervised Speech Representation By Disentangling Speakers (2022) 0.00
- Disentangled Speaker Representation Learning Via Mutual Information Minimization (2022) 5.24
- Advancing The Dimensionality Reduction Of Speaker Embeddings For Speaker Diarisation: Disentangling Noise And Informing Speech Activity (2021) 2.26
- Disentangled Representation Learning For Multilingual Speaker Recognition (2022) 6.34
- Disentangling Voice And Content With Self-supervision For Speaker Recognition (2023) 2.26
- Robust Speaker Recognition Using Unsupervised Adversarial Invariance (2019) 9.76