Scoring Of Large-margin Embeddings For Speaker Verification: Cosine Or PLDA?
2022 · Qiongqiong Wang, Kong Aik Lee, Tianchi Liu
Abstract
The emergence of large-margin softmax cross-entropy losses in training deep speaker embedding neural networks has triggered a gradual shift from parametric back-ends to a simpler cosine similarity measure for speaker verification. Popular parametric back-ends include the probabilistic linear discriminant analysis (PLDA) and its variants. This paper investigates the properties of margin-based cross-entropy losses leading to such a shift and aims to find scoring back-ends best suited for speaker verification. In addition, we revisit the pre-processing techniques which have been widely used in the past and assess their effectiveness on large-margin embeddings. Experiments on the state-of-the-art ECAPA-TDNN networks trained with various large-margin softmax cross-entropy losses show a substantial increment in intra-speaker compactness making the conventional PLDA superfluous. In this regard, we found that constraining the within-speaker covariance matrix could improve the performance of th
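For readers unfamiliar with the simpler back-end the abstract refers to, the following is a minimal sketch of cosine scoring with length normalization. It is an illustration only, not the authors' implementation: the function names `length_normalize` and `cosine_score` and the toy embedding values are assumptions made for this example, and real embeddings would come from a trained extractor such as ECAPA-TDNN.

```python
import numpy as np


def length_normalize(x):
    """Scale each embedding (one per row) to unit Euclidean norm."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    # Clip the norm to avoid division by zero for an all-zero vector.
    return x / np.clip(norms, 1e-12, None)


def cosine_score(enroll, test):
    """Cosine similarity between every enrollment and every test embedding."""
    return length_normalize(enroll) @ length_normalize(test).T


# Toy 3-dimensional embeddings (made-up values, not taken from the paper).
enroll = np.array([[1.0, 0.0, 0.0]])
test = np.array([[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]])

# Rows index enrollment embeddings, columns index test embeddings.
scores = cosine_score(enroll, test)
print(scores)
```

Unlike PLDA, this scoring has no trainable parameters: a trial is accepted when its score exceeds a threshold chosen on development data.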
Related papers
- Large Margin Softmax Loss For Speaker Verification (2019) · 14.66
- Probabilistic Spherical Discriminant Analysis: An Alternative To PLDA For Length-normalized Embeddings (2022) · 6.77
- On Deep Speaker Embeddings For Text-independent Speaker Recognition (2018) · 11.93
- Improved Large-margin Softmax Loss For Speaker Diarisation (2019) · 6.34
- Analyzing Speaker Verification Embedding Extractors And Back-ends Under Language And Channel Mismatch (2022) · 0.00
- Multiobjective Optimization Training Of PLDA For Speaker Verification (2018) · 2.26
- Attention Back-end For Automatic Speaker Verification With Multiple Enrollment Utterances (2021) · 10.21
- Margin Matters: Towards More Discriminative Deep Neural Network Embeddings For Speaker Recognition (2019) · 15.25