Bootstrap Equilibrium And Probabilistic Speaker Representation Learning For Self-supervised Speaker Verification
2021 · Sung Hwan Mun, Min Hyun Han, Dongjune Lee, et al.
Abstract
In this paper, we propose self-supervised speaker representation learning strategies, which comprise bootstrap equilibrium speaker representation learning in the front-end and uncertainty-aware probabilistic speaker embedding training in the back-end. In the front-end stage, we learn the speaker representations via a bootstrap training scheme with a uniformity regularization term. In the back-end stage, the probabilistic speaker embeddings, which provide not only speaker representations but also data uncertainty, are estimated by maximizing the mutual likelihood score between speech samples belonging to the same speaker. Experimental results show that the proposed bootstrap equilibrium training strategy effectively helps learn the speaker representations and outperforms conventional methods based on contrastive learning. We also demonstrate that the integrated two-stage framework further improves the speaker verification performance on the VoxCeleb1 test set in t
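The two quantities named in the abstract, the mutual likelihood score and the uniformity regularization term, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes diagonal-Gaussian embeddings, the mutual likelihood score in the form known from probabilistic face embeddings, and a uniformity term in the commonly used log-mean-Gaussian-potential form; the abstract does not give the exact formulas. The function names, the temperature `t`, and the toy inputs are hypothetical.

```python
import numpy as np


def mutual_likelihood_score(mu1, var1, mu2, var2):
    """Log-likelihood that two diagonal-Gaussian embeddings share one latent point.

    Assumed form (not confirmed by the abstract):
    -0.5 * sum_l [ (mu1_l - mu2_l)^2 / (var1_l + var2_l) + log(var1_l + var2_l) ]
    - (D / 2) * log(2 * pi)
    """
    mu1, var1, mu2, var2 = (
        np.asarray(a, dtype=float) for a in (mu1, var1, mu2, var2)
    )
    var_sum = var1 + var2
    dim = mu1.shape[-1]
    quad = (mu1 - mu2) ** 2 / var_sum
    return -0.5 * np.sum(quad + np.log(var_sum), axis=-1) - 0.5 * dim * np.log(2 * np.pi)


def uniformity_loss(z, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Lower values mean the L2-normalized embeddings are spread more
    evenly over the unit hypersphere. `t` is a hypothetical temperature.
    """
    z = np.asarray(z, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sq_dist = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    upper = np.triu_indices(len(z), k=1)  # each unordered pair once
    return np.log(np.mean(np.exp(-t * sq_dist[upper])))


if __name__ == "__main__":
    var = np.array([0.1, 0.1])
    same = mutual_likelihood_score([1.0, 0.0], var, [1.0, 0.0], var)
    diff = mutual_likelihood_score([1.0, 0.0], var, [0.0, 1.0], var)
    print("same-mean score:", same, "different-mean score:", diff)

    spread = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
    collapsed = np.tile([1.0, 0.0], (4, 1))
    print("uniformity (spread):", uniformity_loss(spread))
    print("uniformity (collapsed):", uniformity_loss(collapsed))
```

Under these assumptions, maximizing the score over same-speaker pairs pulls the means together and lets the variance grow for inputs whose means cannot be matched, which is how a variance estimate can act as a data-uncertainty signal.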
Related papers
- Unsupervised Representation Learning For Speaker Recognition Via Contrastive Equilibrium Learning (2020) · 0.00
- Curriculum Learning For Self-supervised Speaker Verification (2022) · 8.09
- Label-efficient Self-supervised Speaker Verification With Information Maximization And Contrastive Learning (2022) · 6.77
- An Iterative Framework For Self-supervised Deep Speaker Representation Learning (2020) · 10.61
- Self-distillation Prototypes Network: Learning Robust Speaker Representations Without Supervision (2023) · 4.52
- Self-supervised Speaker Verification With Simple Siamese Network And Self-supervised Regularization (2021) · 10.85
- Self-supervised Text-independent Speaker Verification Using Prototypical Momentum Contrastive Learning (2020) · 12.93
- Asymmetric Clean Segments-guided Self-supervised Learning For Robust Speaker Verification (2023) · 5.84