Ensemble Of Jointly Trained Deep Neural Network-based Acoustic Models For Reverberant Speech Recognition
2016 · Jeehye Lee, Myungin Lee, Joon-Hyuk Chang
Abstract
Distant speech recognition is a challenge, particularly due to the corruption of speech signals by reverberation caused by large distances between the speaker and microphone. In order to cope with a wide range of reverberations in real-world situations, we present novel approaches for acoustic modeling including an ensemble of deep neural networks (DNNs) and an ensemble of jointly trained DNNs. First, multiple DNNs are established, each of which corresponds to a different reverberation time 60 (RT60) in a setup step. Also, each model in the ensemble of DNN acoustic models is further jointly trained, including both feature mapping and acoustic modeling, where the feature mapping is designed for the dereverberation as a front-end. In a testing phase, the two most likely DNNs are chosen from the DNN ensemble using maximum a posteriori (MAP) probabilities, computed in an online fashion by using maximum likelihood (ML)-based blind RT60 estimation and then the posterior probability outputs f
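The model-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian likelihood around each model's training RT60, the uniform prior, the `sigma` width, the example RT60 values, and the function name `top_two_models` are all assumptions made for illustration. The blind RT60 estimate is taken as given, and how the two selected models' outputs are combined is left out because the abstract is cut off before describing it.

```python
import math


def top_two_models(rt60_estimate, model_rt60s, priors=None, sigma=0.1):
    """Pick the two ensemble members with the highest posterior probability.

    rt60_estimate: blind RT60 estimate in seconds (assumed to be given).
    model_rt60s:   RT60 each ensemble member was trained for (hypothetical values).
    priors:        prior probability of each member; uniform if omitted.
    sigma:         width of the assumed Gaussian likelihood, in seconds.

    Returns a list of (model_index, weight) pairs, best first, with the
    two weights renormalized to sum to 1.
    """
    n = len(model_rt60s)
    if n < 2:
        raise ValueError("need at least two models in the ensemble")
    if priors is None:
        priors = [1.0 / n] * n

    # Log of (prior * assumed Gaussian likelihood), up to a shared constant.
    log_scores = [
        math.log(p) - 0.5 * ((rt60_estimate - r) / sigma) ** 2
        for p, r in zip(priors, model_rt60s)
    ]

    # Subtract the maximum before exponentiating to avoid underflow.
    best = max(log_scores)
    scores = [math.exp(s - best) for s in log_scores]
    total = sum(scores)
    posteriors = [s / total for s in scores]

    # Keep the two most probable models and renormalize their posteriors.
    ranked = sorted(range(n), key=lambda i: posteriors[i], reverse=True)[:2]
    pair_mass = sum(posteriors[i] for i in ranked)
    return [(i, posteriors[i] / pair_mass) for i in ranked]


# Hypothetical ensemble trained at four RT60 values (seconds).
selected = top_two_models(0.55, [0.3, 0.5, 0.7, 0.9])
```

With an estimate of 0.55 s, the members trained at 0.5 s and 0.7 s are selected, and the closer one receives the larger weight.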
Related papers
- A Network Of Deep Neural Networks For Distant Speech Recognition (2017)
- Deep Learning Based Dereverberation Of Temporal Envelopes For Robust Speech Recognition (2020)
- Convolutive Prediction For Monaural Speech Dereverberation And Noisy-reverberant Speaker Separation (2021)
- On Combining Features For Single-channel Robust Speech Recognition In Reverberant Environments (2019)
- Dereverberation Of Autoregressive Envelopes For Far-field Speech Recognition (2021)
- Deep Learning For Distant Speech Recognition (2017)
- 3-D Feature And Acoustic Modeling For Far-field Speech Recognition (2019)
- Batch-normalized Joint Training For DNN-based Distant Speech Recognition (2017)