Asymmetric Clean Segments-guided Self-supervised Learning For Robust Speaker Verification
2023 · Chong-Xin Gan, Man-Wai Mak, Weiwei Lin, et al.
Abstract
Contrastive self-supervised learning (CSL) for speaker verification (SV) has drawn increasing interest recently due to its ability to exploit unlabeled data. Performing data augmentation on raw waveforms, such as adding noise or reverberation, plays a pivotal role in achieving promising results in SV. Data augmentation, however, demands meticulous calibration to keep speaker-specific information intact, which is difficult to achieve without speaker labels. To address this issue, we introduce a novel framework that incorporates both clean and augmented segments into the contrastive training pipeline. The clean segments are repurposed to pair with noisy segments to form additional positive and negative pairs. Moreover, the contrastive loss is weighted to increase the difference between the clean and augmented embeddings of different speakers. Experimental results on VoxCeleb1 suggest that the proposed framework can achieve a remarkable 19% improvement over the conventional methods, and it su
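The pairing of clean and augmented segments described in the abstract can be illustrated with a minimal sketch. It assumes an InfoNCE-style loss in which `clean[i]` and `augmented[i]` are embeddings from the same utterance (a positive pair) and embeddings from different utterances are treated as different speakers (negative pairs), since no speaker labels are available. The function names, the `neg_weight` parameter, and the specific weighting scheme are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def weighted_contrastive_loss(clean, augmented, temperature=0.1, neg_weight=2.0):
    """Hypothetical clean-vs-augmented contrastive loss.

    clean[i] and augmented[i] come from the same utterance (positive pair).
    clean[i] and augmented[j], j != i, are assumed to come from different
    speakers (negative pairs). neg_weight > 1 up-weights the negatives, which
    penalizes similarity between clean and augmented embeddings of different
    utterances more strongly. This weighting is an assumption for
    illustration only.
    """
    n = len(clean)
    total = 0.0
    for i in range(n):
        pos = math.exp(cosine(clean[i], augmented[i]) / temperature)
        neg = sum(
            neg_weight * math.exp(cosine(clean[i], augmented[j]) / temperature)
            for j in range(n)
            if j != i
        )
        total += -math.log(pos / (pos + neg))
    return total / n


# Toy usage: two utterances with 2-dimensional embeddings.
clean = [[1.0, 0.0], [0.0, 1.0]]
augmented = [[0.9, 0.1], [0.1, 0.9]]
loss = weighted_contrastive_loss(clean, augmented)
```

In this sketch the loss is low when each clean embedding is closest to the augmented embedding of its own utterance, and it grows if the pairing is swapped or if `neg_weight` is increased.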
Related papers
- Self-supervised Speaker Verification With Simple Siamese Network And Self-supervised Regularization (2021) 10.85
- C3-DINO: Joint Contrastive And Non-contrastive Self-supervised Learning For Speaker Verification (2022) 10.21
- Towards Supervised Performance On Speaker Verification With Self-supervised Learning By Leveraging Large-scale ASR Models (2024) 7.50
- Augmentation Adversarial Training For Self-supervised Speaker Recognition (2020) 0.00
- Experimenting With Additive Margins For Contrastive Self-supervised Speaker Verification (2023) 4.52
- Additive Margin In Contrastive Self-supervised Frameworks To Learn Discriminative Speaker Representations (2024) 2.26
- Label-efficient Self-supervised Speaker Verification With Information Maximization And Contrastive Learning (2022) 6.77
- Self-supervised Text-independent Speaker Verification Using Prototypical Momentum Contrastive Learning (2020) 12.93