Disentangled Speaker Representation Learning Via Mutual Information Minimization
2022 · Sung Hwan Mun, Min Hyun Han, Minchan Kim, et al.
Abstract
The domain mismatch problem caused by speaker-unrelated features has been a major topic in speaker recognition. In this paper, we propose an explicit disentanglement framework that separates speaker-relevant features from speaker-unrelated features via mutual information (MI) minimization. To minimize the MI between speaker-related and speaker-unrelated features, we adopt the contrastive log-ratio upper bound (CLUB), which exploits an upper bound on MI. Our framework has a three-stage structure. First, in the front-end encoder, input speech is encoded into a shared initial embedding. Next, in the decoupling block, the shared initial embedding is split into separate speaker-related and speaker-unrelated embeddings. Finally, disentanglement is performed by MI minimization in the last stage. Experiments on the Far-Field Speaker Verification Challenge 2022 (FFSVC2022) demonstrate that our proposed framework is effective for disentanglement. Also, to utilize domain-unknown datasets
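To illustrate the MI-minimization stage, here is a minimal sketch of the sampled CLUB estimate: the mean of log q(y_i | x_i) over paired samples minus the mean of log q(y_j | x_i) over all pairs. It is not the authors' implementation. The names `club_upper_bound`, `gaussian_log_density`, and `identity_predictor` are hypothetical, and the diagonal-Gaussian form of q(y | x) is an assumption; in practice q would be a small network trained alongside the encoder.

```python
import math


def gaussian_log_density(y, mean, log_var):
    """Log density of vector y under a diagonal Gaussian N(mean, exp(log_var))."""
    total = 0.0
    for y_k, m_k, lv_k in zip(y, mean, log_var):
        total += -0.5 * (
            math.log(2.0 * math.pi) + lv_k + (y_k - m_k) ** 2 / math.exp(lv_k)
        )
    return total


def club_upper_bound(xs, ys, predict):
    """Sampled CLUB estimate of an upper bound on I(x; y).

    xs, ys  : lists of paired embeddings (for example speaker-related and
              speaker-unrelated vectors from the same utterance).
    predict : hypothetical variational approximation q(y | x); maps x to
              (mean, log_var) of a diagonal Gaussian over y.
    """
    n = len(xs)
    params = [predict(x) for x in xs]
    # Average log-likelihood of the true pairs (x_i, y_i).
    positive = sum(gaussian_log_density(ys[i], *params[i]) for i in range(n)) / n
    # Average log-likelihood over all pairs (x_i, y_j), i.e. shuffled pairs.
    negative = sum(
        gaussian_log_density(ys[j], *params[i]) for i in range(n) for j in range(n)
    ) / (n * n)
    return positive - negative


def identity_predictor(x):
    """Toy stand-in for a trained q(y | x): mean equals x, unit variance."""
    return list(x), [0.0] * len(x)


# Toy usage: the second embedding copies the first, so the estimate is positive.
speaker_related = [[0.0], [1.0]]
speaker_unrelated = [[0.0], [1.0]]
print(club_upper_bound(speaker_related, speaker_unrelated, identity_predictor))
```

During training, an estimate of this kind would be added to the loss so that the speaker-related and speaker-unrelated embeddings carry less shared information. When the second embedding does not vary with the first, the two averages coincide and the estimate is zero.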
Related papers
- Intra-class Variation Reduction Of Speaker Representation In Disentanglement Framework (2020) · 8.35
- DEAAN: Disentangled Embedding And Adversarial Adaptation Network For Robust Speaker Representation Learning (2020) · 9.59
- Disentangled Representation Learning For Environment-agnostic Speaker Recognition (2024) · 4.82
- Disentangling Age And Identity With A Mutual Information Minimization Approach For Cross-age Speaker Verification (2024) · 2.26
- Learning Speaker Representations With Mutual Information (2018) · 11.76
- VQMIVC: Vector Quantization And Mutual Information-based Unsupervised Speech Representation Disentanglement For One-shot Voice Conversion (2021) · 20.31
- Label-efficient Self-supervised Speaker Verification With Information Maximization And Contrastive Learning (2022) · 6.77
- Speaker Representation Learning Via Contrastive Loss With Maximal Speaker Separability (2022) · 10.68