Self-supervised Learning With Cluster-aware-DINO For High-performance Robust Speaker Verification
2023 · Bing Han, Zhengyang Chen, Yanmin Qian
Abstract
Automatic speaker verification has made great progress using deep learning approaches trained on large-scale, manually annotated datasets. However, collecting a large amount of well-labeled data for system building is difficult and expensive. In this paper, we propose a novel self-supervised learning framework that can construct a high-performance speaker verification system without using any labeled data. To avoid the impact of false negative pairs, we adopt the self-distillation with no labels (DINO) framework as the initial model, which can be trained without exploiting negative pairs. We then introduce a cluster-aware training strategy for DINO to improve the diversity of data. In the iterative learning stage, clustering produces many unreliable labels, so the quality of the pseudo labels is important for system training. This motivates us to propose dynamic loss-gate and label correction (DLG-LC) methods to alleviate the performance degradation caus
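The abstract notes that DINO can be trained without negative pairs. As a rough illustration of why, the sketch below shows the general DINO-style objective: a student network is trained to match the output distribution of a momentum teacher on another view of the same utterance, so no negative pair ever enters the loss. This is a minimal pure-Python sketch of the generic DINO loss, not the authors' implementation; the function names (`dino_loss`, `ema_update`) and the temperature values are assumptions chosen for illustration.

```python
import math


def log_softmax(logits, temperature):
    """Temperature-scaled log-softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_total = math.log(sum(math.exp(x - m) for x in scaled))
    return [x - m - log_total for x in scaled]


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student distributions for one view pair.

    The teacher output is centered and sharpened with a low temperature;
    both steps are commonly used in DINO-style training to avoid collapse.
    The temperature defaults are illustrative assumptions.
    """
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(lp) for lp in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))


def ema_update(old, new, momentum):
    """Exponential moving average, usable for teacher weights or the center."""
    return [momentum * o + (1.0 - momentum) * n for o, n in zip(old, new)]


if __name__ == "__main__":
    center = [0.0, 0.0, 0.0]
    teacher = [2.0, 0.0, -1.0]
    print(dino_loss([2.0, 0.0, -1.0], teacher, center))  # student agrees
    print(dino_loss([-1.0, 0.0, 2.0], teacher, center))  # student disagrees
```

In this sketch the loss is small when the student agrees with the teacher and large when it does not. In a full system the teacher parameters would be updated with something like `ema_update` rather than by gradient descent.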
Related papers
- Pushing The Limits Of Self-supervised Speaker Verification Using Regularized Distillation Framework (2022) · 17.00
- DINO-VITS: Data-efficient Zero-shot TTS With Self-supervised Speaker Verification Loss For Noise Robustness (2023) · 3.58
- Self-supervised Speaker Verification Using Dynamic Loss-gate And Label Correction (2022) · 10.74
- Curriculum Learning For Self-supervised Speaker Verification (2022) · 8.09
- DinoSR: Self-distillation And Online Clustering For Self-supervised Speech Representation Learning (2023) · 0.00
- Leveraging In-the-wild Data For Effective Self-supervised Pretraining In Speaker Recognition (2023) · 3.58
- Self-distillation Prototypes Network: Learning Robust Speaker Representations Without Supervision (2023) · 4.52
- Self-supervised Learning Based Domain Adaptation For Robust Speaker Verification (2021) · 11.49