Improved Meta-learning Training For Speaker Verification
2021 · Yafeng Chen, Wu Guo, Bin Gu
Abstract
Meta-learning has recently become a research hotspot in speaker verification (SV). We introduce two methods to improve the meta-learning training for SV in this paper. For the first method, a backbone embedding network is first jointly trained with the conventional cross entropy loss and prototypical networks (PN) loss. Then, inspired by speaker adaptive training in speech recognition, additional transformation coefficients are trained with only the PN loss. The transformation coefficients are used to modify the original backbone embedding network in the x-vector extraction process. Furthermore, the random erasing data augmentation technique is applied to all support samples in each episode to construct positive pairs, and a contrastive loss between the augmented and the original support samples is added to the objective in model training. Experiments are carried out on the SITW and VOiCES databases. Both of the methods can obtain consistent improvements over existing meta-learning tra
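As a rough illustration of the prototypical networks (PN) loss named in the abstract, the following is a minimal sketch for a single training episode, not the authors' implementation. It assumes squared Euclidean distance as the metric and plain Python lists as embeddings; the function name `prototypical_loss` and its argument layout are hypothetical. Each speaker's prototype is the mean of that speaker's support embeddings, and each query embedding is scored against all prototypes with a softmax over negative distances.

```python
import math
from collections import defaultdict


def prototypical_loss(support, support_labels, query, query_labels):
    """Average PN loss over the query embeddings of one episode (sketch).

    support, query: lists of equal-length float lists (embeddings).
    support_labels, query_labels: speaker labels, one per embedding.
    Every query label is assumed to appear among the support labels.
    """
    # Group support embeddings by speaker label.
    groups = defaultdict(list)
    for emb, lab in zip(support, support_labels):
        groups[lab].append(emb)

    # Prototype = per-dimension mean of a speaker's support embeddings.
    prototypes = {
        lab: [sum(col) / len(embs) for col in zip(*embs)]
        for lab, embs in groups.items()
    }

    total = 0.0
    for emb, lab in zip(query, query_labels):
        # Logit = negative squared Euclidean distance (assumed metric).
        logits = {
            p_lab: -sum((a - b) ** 2 for a, b in zip(emb, proto))
            for p_lab, proto in prototypes.items()
        }
        # Numerically stable log-sum-exp over all prototypes.
        m = max(logits.values())
        log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Negative log-probability of the true speaker's prototype.
        total += log_z - logits[lab]
    return total / len(query)


# Toy episode: two speakers, two support embeddings each, one query each.
support = [[0.0, 0.0], [0.2, 0.0], [3.0, 3.0], [3.0, 3.2]]
support_labels = ["spk_a", "spk_a", "spk_b", "spk_b"]
query = [[0.1, 0.1], [2.9, 3.0]]
query_labels = ["spk_a", "spk_b"]
print(prototypical_loss(support, support_labels, query, query_labels))
```

In the joint training stage the abstract describes, a loss of this kind would be combined with a conventional cross-entropy classification loss; how the two terms are weighted is not stated in the text above.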
Related papers
- Improved Relation Networks For End-to-end Speaker Verification And Identification (2022) · 2.26
- Multi-task Metric Learning For Text-independent Speaker Verification (2020) · 0.00
- Adapting End-to-end Neural Speaker Verification To New Languages And Recording Conditions With Adversarial Training (2018) · 9.59
- Improving Embedding Extraction For Speaker Verification With Ladder Network (2020) · 0.00
- Self-supervised Text-independent Speaker Verification Using Prototypical Momentum Contrastive Learning (2020) · 12.93
- Learning Metrics From Mean Teacher: A Supervised Learning Method For Improving The Generalization Of Speaker Verification System (2021) · 0.00
- Improving Transformer-based Networks With Locality For Automatic Speaker Verification (2023) · 0.00
- Label-efficient Self-supervised Speaker Verification With Information Maximization And Contrastive Learning (2022) · 6.77