Self-distillation Prototypes Network: Learning Robust Speaker Representations Without Supervision
2023 · Yafeng Chen, Siqi Zheng, Hui Wang, et al.
Abstract
Training speaker-discriminative and robust speaker verification systems without explicit speaker labels remains a persistent challenge. In this paper, we propose a novel self-supervised speaker verification approach, Self-Distillation Prototypes Network (SDPN), which facilitates self-supervised speaker representation learning. SDPN assigns the representations of the augmented views of an utterance to the same prototypes as the representation of the original view, thereby enabling effective knowledge transfer between the augmented and original views. Because SDPN training uses no negative pairs, the network tends to align positive pairs very closely in the embedding space, a phenomenon known as model collapse. To mitigate this problem, we add a diversity regularization term on the embeddings in SDPN. Comprehensive experiments on the VoxCeleb datasets demonstrate the superiority of SDPN among self-supervised speaker verification approaches. SDPN sets a new sta
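The two ideas named in the abstract, matching prototype assignments across views and penalizing low embedding diversity, can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the temperatures, the cross-entropy form of the assignment loss, and the mean pairwise cosine similarity used as a stand-in diversity penalty. The actual SDPN losses may differ.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def softmax(logits, temperature):
    """Temperature-scaled, numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def prototype_assignment_loss(orig_emb, aug_emb, prototypes,
                              t_teacher=0.04, t_student=0.1):
    """Cross-entropy between the prototype assignment of the original view
    (target) and that of the augmented view (prediction).

    Assumed form: cosine similarity to prototypes followed by a softmax;
    the temperatures are hypothetical defaults, not values from the paper.
    """
    protos = l2_normalize(prototypes)
    target = softmax(l2_normalize(orig_emb) @ protos.T, t_teacher)
    pred = softmax(l2_normalize(aug_emb) @ protos.T, t_student)
    return float(-(target * np.log(pred + 1e-12)).sum(axis=-1).mean())

def diversity_penalty(emb):
    """Mean cosine similarity between different utterances in a batch.

    A stand-in regularizer (an assumption): it approaches 1.0 when all
    embeddings collapse to one direction and 0.0 when they are orthogonal.
    """
    z = l2_normalize(emb)
    sim = z @ z.T
    off_diagonal = sim[~np.eye(sim.shape[0], dtype=bool)]
    return float(off_diagonal.mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.normal(size=(4, 8))                    # 4 utterances, 8-dim embeddings
    augmented = original + 0.1 * rng.normal(size=(4, 8))  # simulated augmented views
    prototypes = rng.normal(size=(16, 8))                 # 16 prototype vectors
    total = (prototype_assignment_loss(original, augmented, prototypes)
             + 1.0 * diversity_penalty(augmented))        # hypothetical weight of 1.0
    print(total)
```

In a real training loop the embeddings would come from teacher and student encoders and the prototypes would be learned; the sketch only shows how the two terms could be combined into one objective.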
Related papers
- Self-supervised Speaker Verification With Simple Siamese Network And Self-supervised Regularization (2021) · 10.85
- Pushing The Limits Of Self-supervised Speaker Verification Using Regularized Distillation Framework (2022) · 17.00
- Augmentation Adversarial Training For Self-supervised Speaker Recognition (2020) · 0.00
- An Iterative Framework For Self-supervised Deep Speaker Representation Learning (2020) · 10.61
- Self-supervised Learning With Cluster-Aware-DINO For High-performance Robust Speaker Verification (2023) · 0.00
- Curriculum Learning For Self-supervised Speaker Verification (2022) · 8.09
- Self-supervised Training Of Speaker Encoder With Multi-modal Diverse Positive Pairs (2022) · 8.35
- Self-supervised Reflective Learning Through Self-distillation And Online Clustering For Speaker Representation Learning (2024) · 2.26