Exploring Gaussian Mixture Model Framework For Speaker Adaptation Of Deep Neural Network Acoustic Models
2020 · Natalia Tomashenko, Yuri Khokhlov, Yannick Esteve
Abstract
In this paper we investigate the GMM-derived (GMMD) features for adaptation of deep neural network (DNN) acoustic models. The adaptation of the DNN trained on GMMD features is done through the maximum a posteriori (MAP) adaptation of the auxiliary GMM model used for GMMD feature extraction. We explore fusion of the adapted GMMD features with conventional features, such as bottleneck and MFCC features, in two different neural network architectures: DNN and time-delay neural network (TDNN). We analyze and compare different types of adaptation techniques such as i-vectors and feature-space adaptation techniques based on maximum likelihood linear regression (fMLLR) with the proposed adaptation approach, and explore their complementarity using various types of fusion such as feature level, posterior level, lattice level and others in order to discover the best possible way of combination. Experimental results on the TED-LIUM corpus show that the proposed adaptation technique can be effectiv
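The abstract names MAP adaptation of an auxiliary GMM as the adaptation mechanism. As a rough illustration only, the sketch below applies the textbook MAP update to the means of a diagonal-covariance GMM. It is not taken from the paper. The function names, the relevance factor `tau`, and the toy data are all assumptions made for this example. It also leaves out the HMM state alignment and the GMMD feature extraction step itself.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Component posteriors gamma_m(x) for a single frame (log-sum-exp stabilised)."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(frames, weights, means, variances, tau=5.0):
    """MAP update of the means only; weights and variances are left unchanged.

    tau is a hypothetical relevance factor: larger values keep the adapted
    means closer to the speaker-independent ones.
    """
    dim = len(means[0])
    occupancy = [0.0] * len(means)              # sum_t gamma_m(t)
    first_order = [[0.0] * dim for _ in means]  # sum_t gamma_m(t) * x_t
    for x in frames:
        for m, gamma in enumerate(posteriors(x, weights, means, variances)):
            occupancy[m] += gamma
            for d in range(dim):
                first_order[m][d] += gamma * x[d]
    return [
        [
            (tau * means[m][d] + first_order[m][d]) / (tau + occupancy[m])
            for d in range(dim)
        ]
        for m in range(len(means))
    ]


# Toy one-dimensional example with made-up numbers.
weights = [0.5, 0.5]
means = [[0.0], [10.0]]
variances = [[1.0], [1.0]]
speaker_frames = [[1.0], [1.2], [0.8], [11.0], [11.2], [10.8]]
adapted = map_adapt_means(speaker_frames, weights, means, variances, tau=3.0)
```

In this toy case each component receives about three frames whose average lies one unit above the prior mean. With `tau=3.0` the adapted means therefore land about halfway between prior and data, near 0.5 and 10.5. With fewer adaptation frames the update would stay closer to the speaker-independent means.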
Related papers
- Empirical Evaluation Of Speaker Adaptation On DNN Based Acoustic Model (2018)
- Bayesian Learning For Deep Neural Network Adaptation (2020)
- Generalized Domain Adaptation Framework For Parametric Back-end In Speaker Recognition (2023)
- Cumulative Adaptation For BLSTM Acoustic Models (2019)
- Speaker Adaptation Using Spectro-temporal Deep Features For Dysarthric And Elderly Speech Recognition (2022)
- Unsupervised Model-based Speaker Adaptation Of End-to-end Lattice-free MMI Model For Speech Recognition (2022)
- Multimodal Speech Synthesis Architecture For Unsupervised Speaker Adaptation (2018)
- Linear Networks Based Speaker Adaptation For Speech Synthesis (2018)