End-to-end Residual CNN With L-GM Loss Speaker Verification System
2018 · Xuan Shi, Xingjian Du, Mengyao Zhu
Abstract
We propose an end-to-end speaker verification system based on a neural network and trained with a loss function of lower computational complexity. The system consists of a ResNet architecture that extracts features from an utterance and produces utterance-level speaker embeddings, and it is trained using the large-margin Gaussian Mixture (L-GM) loss function. Through its large-margin and likelihood-regularization terms, the L-GM loss benefits speaker verification performance. Experimental results demonstrate that the Residual CNN with L-GM loss outperforms a DNN-based i-vector baseline by more than 10% in accuracy.
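The loss named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it follows the general L-GM formulation under two simplifying assumptions, identity covariance for every class and equal class priors, and the function name `lgm_loss` and the default values of `alpha` (margin) and `lam` (likelihood weight) are hypothetical.

```python
import math

def lgm_loss(embedding, centers, label, alpha=1.0, lam=0.1):
    """L-GM loss for one embedding (assumes identity covariances, equal priors)."""
    # Half squared Euclidean distance from the embedding to each class mean.
    dists = [0.5 * sum((x - m) ** 2 for x, m in zip(embedding, mu))
             for mu in centers]
    # Large margin: the true-class distance is scaled by (1 + alpha),
    # so the embedding must sit closer to its own mean to score well.
    logits = [-(1.0 + alpha) * d if k == label else -d
              for k, d in enumerate(dists)]
    # Cross-entropy over the margin-adjusted logits (stable log-sum-exp).
    mx = max(logits)
    lse = mx + math.log(sum(math.exp(v - mx) for v in logits))
    cls_loss = lse - logits[label]
    # Likelihood regularization: pulls the embedding toward its class mean.
    lkd_loss = dists[label]
    return cls_loss + lam * lkd_loss

# Two hypothetical speaker means in a 2-D embedding space.
centers = [[0.0, 0.0], [2.0, 0.0]]
print(lgm_loss([0.5, 0.0], centers, label=0))
```

Setting `alpha=0` and `lam=0` reduces the sketch to a plain softmax over negative distances, which makes the effect of the two extra terms easy to compare.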
Related papers
- Generalized End-to-end Loss For Speaker Verification (2017) · 20.58
- Large Margin Softmax Loss For Speaker Verification (2019) · 14.66
- Linear Regression For Speaker Verification (2018) · 0.00
- Dr-vectors: Decision Residual Networks And An Improved Loss For Speaker Recognition (2021) · 8.60
- Gmm-resnext: Combining Generative And Discriminative Models For Speaker Verification (2024) · 4.52
- Angular Softmax Loss For End-to-end Speaker Verification (2018) · 11.19
- End-to-end DNN Based Speaker Recognition Inspired By I-vector And PLDA (2017) · 10.35
- Generative Adversarial Speaker Embedding Networks For Domain Robust End-to-end Speaker Verification (2018) · 0.00