Emphasized Non-target Speaker Knowledge In Knowledge Distillation For Automatic Speaker Verification
2023 · Duc-Tuan Truong, Ruijie Tao, Jia Qi Yip, et al.
Abstract
Knowledge distillation (KD) is used to enhance automatic speaker verification performance by ensuring consistency between large teacher networks and lightweight student networks at the embedding level or label level. However, the conventional label-level KD overlooks the significant knowledge from non-target speakers, particularly their classification probabilities, which can be crucial for automatic speaker verification. In this paper, we first demonstrate that leveraging a larger number of training non-target speakers improves the performance of automatic speaker verification models. Inspired by this finding about the importance of non-target speakers' knowledge, we modified the conventional label-level KD by disentangling and emphasizing the classification probabilities of non-target speakers during knowledge distillation. The proposed method is applied to three different student model architectures and achieves an average of 13.67% improvement in EER on the VoxCeleb dataset compare
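The abstract describes splitting label-level KD into a target-speaker part and a non-target-speaker part, then emphasizing the latter. The sketch below is one way this could look, assuming a decoupled formulation in which each part is a KL divergence and the non-target part carries a larger weight. The function names, the weights `alpha` and `beta`, and the temperature are illustrative assumptions; the abstract does not give the paper's exact loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def split_target(probs, target):
    """Return ([p_target, p_rest], renormalized non-target distribution)."""
    rest = [p for i, p in enumerate(probs) if i != target]
    rest_mass = sum(rest)
    return [probs[target], rest_mass], [p / rest_mass for p in rest]


def emphasized_non_target_kd(teacher_logits, student_logits, target,
                             temperature=4.0, alpha=1.0, beta=4.0):
    """Hypothetical label-level KD loss with a separately weighted non-target term.

    alpha weights the target-vs-rest term and beta weights the term over
    non-target speakers only; beta > alpha emphasizes non-target knowledge.
    These weights are assumptions, not values taken from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    binary_t, non_target_t = split_target(p_teacher, target)
    binary_s, non_target_s = split_target(p_student, target)
    target_term = kl_divergence(binary_t, binary_s)
    non_target_term = kl_divergence(non_target_t, non_target_s)
    # The temperature**2 factor is the usual KD gradient-scale correction.
    return (alpha * target_term + beta * non_target_term) * temperature ** 2


# Teacher and student agree on the target speaker (index 0) but rank the
# non-target speakers differently, so only the non-target term is active.
teacher = [5.0, 1.0, 0.5, 0.2]
student = [5.0, 0.2, 0.5, 1.0]
loss_plain = emphasized_non_target_kd(teacher, student, target=0, beta=1.0)
loss_emphasized = emphasized_non_target_kd(teacher, student, target=0, beta=4.0)
```

In this toy case a conventional KD loss would treat the disagreement among non-target speakers as a small part of one overall divergence, whereas the separate `beta` weight lets it be scaled up independently of the target term.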
Related papers
- Integrated Multi-level Knowledge Distillation For Enhanced Speaker Verification (2024) · 0.00
- One-step Knowledge Distillation And Fine-tuning In Using Large Pre-trained Self-supervised Learning Models For Speaker Verification (2023) · 7.81
- VIC-KD: Variance-invariance-covariance Knowledge Distillation To Make Keyword Spotting More Robust Against Adversarial Attacks (2023) · 2.26
- Distilling Multi-level X-vector Knowledge For Small-footprint Speaker Verification (2023) · 0.00
- Leveraging ASR Pretrained Conformers For Speaker Verification Through Transfer Learning And Knowledge Distillation (2023) · 10.74
- Inter-KD: Intermediate Knowledge Distillation For CTC-based Automatic Speech Recognition (2022) · 7.50
- Distil-DCCRN: A Small-footprint DCCRN Leveraging Feature-based Knowledge Distillation In Speech Enhancement (2024) · 2.26
- Multi-level Knowledge Distillation For Speech Emotion Recognition In Noisy Conditions (2023) · 7.81