Margin Matters: Towards More Discriminative Deep Neural Network Embeddings For Speaker Recognition
2019 · Xu Xiang, Shuai Wang, Houjun Huang, et al.
Abstract
Speaker embeddings extracted from a speaker-discriminative deep neural network (DNN) have recently outperformed conventional methods such as i-vectors. In most cases, the DNN speaker classifier is trained using cross-entropy loss with softmax. However, this loss function does not explicitly encourage inter-class separability and intra-class compactness, so the resulting embeddings are not optimal for speaker recognition tasks. To address this issue, this paper introduces three margin-based losses to deep speaker embedding learning; these losses not only separate classes but also demand a fixed margin between them. The margin is shown to be the key to obtaining more discriminative speaker embeddings. Experiments are conducted on two public text-independent tasks: VoxCeleb1 and Speakers in the Wild (SITW). The proposed approach achieves state-of-the-art performance, with a 25% to 30% equal error rate (EER) reduction on both tasks whe
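To make the idea of a fixed margin between classes concrete, here is a minimal sketch in plain Python. The abstract does not name the three losses in the text shown here, so the example uses an additive cosine margin purely as one illustrative margin-based loss. The function names, the toy vectors, and the margin and scale values are all assumptions made for this illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_softmax_loss(embedding, class_weights, target, margin=0.2, scale=30.0):
    """Cross entropy over scaled cosine logits (illustrative sketch).

    `margin` is subtracted from the target-class cosine before scaling.
    With margin=0 this reduces to softmax cross entropy on cosine logits.
    The default margin and scale are assumed values, not the paper's.
    """
    logits = []
    for k, w in enumerate(class_weights):
        cos = cosine(embedding, w)
        if k == target:
            cos -= margin  # the target class must win by at least `margin`
        logits.append(scale * cos)
    # Numerically stable log-sum-exp of the logits.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


# Toy data (hypothetical): a 2-D embedding close to speaker 0,
# and one weight vector per speaker class.
embedding = [1.0, 0.2]
class_weights = [[1.0, 0.0], [0.0, 1.0]]

loss_plain = margin_softmax_loss(embedding, class_weights, target=0, margin=0.0)
loss_margin = margin_softmax_loss(embedding, class_weights, target=0, margin=0.2)
print(loss_plain, loss_margin)
```

For the same embedding, the loss with a margin is larger than the loss without one. Training therefore keeps pushing the embedding toward its own class and away from the others even after it is already classified correctly, which is the mechanism behind the inter-class separability and intra-class compactness the abstract refers to.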
Related papers
- Large Margin Softmax Loss For Speaker Verification (2019) · 14.66
- Unified Hypersphere Embedding For Speaker Recognition (2018) · 0.00
- Challenging Margin-based Speaker Embedding Extractors By Using The Variational Information Bottleneck (2024) · 0.00
- On Deep Speaker Embeddings For Text-independent Speaker Recognition (2018) · 11.93
- Improved Large-margin Softmax Loss For Speaker Diarisation (2019) · 6.34
- A Comparative Re-assessment Of Feature Extractors For Deep Speaker Embeddings (2020) · 8.09
- A Study On Angular Based Embedding Learning For Text-independent Speaker Verification (2019) · 2.26
- Dr-vectors: Decision Residual Networks And An Improved Loss For Speaker Recognition (2021) · 8.60