GMM-ResNext: Combining Generative And Discriminative Models For Speaker Verification
2024 · Hui Yan, Zhenchun Lei, Changhong Liu, et al.
Abstract
With the development of deep learning, many different network architectures have been explored for speaker verification. However, most systems rely on a single deep learning architecture, and hybrid networks that combine different architectures have been little studied in automatic speaker verification (ASV) tasks. In this paper, we propose the GMM-ResNext model for speaker verification. A conventional GMM does not consider the score distribution of each frame feature over all Gaussian components, and it ignores the relationship between neighboring speech frames. We therefore extract log Gaussian probability features from the raw acoustic features and use a ResNext-based network as the backbone to extract the speaker embedding. GMM-ResNext combines generative and discriminative models to improve the generalization ability of deep learning models, and it makes it easier to specify meaningful priors on model parameters. We also propose a two-path GMM-ResNext model based on two gender-related GMMs. The Expe
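As a rough illustration of the "log Gaussian probability features" mentioned in the abstract, the sketch below scores each frame against every Gaussian component and keeps the whole row of log-densities rather than a single mixture likelihood. It is not the authors' implementation: the function name `log_gaussian_features`, the diagonal-covariance assumption, the omission of mixture weights and of any feature normalization, and the toy numbers are all assumptions made for this example.

```python
import math


def log_gaussian_features(frames, means, variances):
    """Return a T x K list of log N(x_t | mu_k, diag(var_k)).

    frames:    T frames, each a list of D raw acoustic feature values
    means:     K component means, each a list of D values
    variances: K diagonal variances, each a list of D positive values

    Assumption: diagonal covariances, no mixture weights, no normalization.
    """
    features = []
    for x in frames:
        row = []
        for mu, var in zip(means, variances):
            log_prob = 0.0
            # With a diagonal covariance, the log-density is a sum over dimensions.
            for x_d, mu_d, var_d in zip(x, mu, var):
                log_prob += -0.5 * (
                    math.log(2.0 * math.pi * var_d) + (x_d - mu_d) ** 2 / var_d
                )
            row.append(log_prob)
        features.append(row)
    return features


# Toy data (hypothetical): 3 frames of 2-dimensional features, 2 components.
frames = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]
means = [[0.0, 0.0], [2.0, 1.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

feats = log_gaussian_features(frames, means, variances)
```

Each row of `feats` describes how one frame scores against all components, and stacking the rows over time gives a frame-by-component map. Under the assumptions above, a map of this kind is what a convolutional backbone could take as input in order to model neighboring frames jointly.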
Related papers
- GMM-ResNet2: Ensemble Of Group ResNet Networks For Synthetic Speech Detection (2024) · 7.16
- Coupling A Generative Model With A Discriminative Learning Framework For Speaker Verification (2021) · 5.24
- Generative Adversarial Speaker Embedding Networks For Domain Robust End-to-end Speaker Verification (2018) · 0.00
- End-to-end Residual CNN With L-GM Loss Speaker Verification System (2018) · 2.26
- Investigation Of Frame Alignments For GMM-based Digit-prompted Speaker Verification (2017) · 4.52
- Deep Neural Network Embeddings With Gating Mechanisms For Text-independent Speaker Verification (2019) · 8.82
- Siamese Neural Network With Joint Bayesian Model Structure For Speaker Verification (2021) · 0.00
- Dr-Vectors: Decision Residual Networks And An Improved Loss For Speaker Recognition (2021) · 8.60