Gated Recurrent Unit Based Acoustic Modeling With Future Context
2018 · Jie Li, Xiaorui Wang, Yuanyuan Zhao, et al.
Abstract
Future contextual information is typically helpful for acoustic modeling. For a recurrent neural network (RNN), however, it is not easy to model future temporal context effectively while keeping model latency low. In this paper, we attempt to design an RNN acoustic model that can use future context effectively and directly, with model latency and computation cost as low as possible. The proposed model is based on the minimal gated recurrent unit (mGRU) with an input projection layer inserted in it. Two context modules, temporal encoding and temporal convolution, are specifically designed for this architecture to model the future context. Experimental results on the Switchboard task and an internal Mandarin ASR task show that the proposed model performs much better than long short-term memory (LSTM) and mGRU models while enabling online decoding with a maximum latency of 170 ms. This model even outperforms a very strong bas
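The abstract does not give the model equations, so the following is only a rough sketch of the general idea: a single-gate recurrent layer with an input projection that looks a fixed number of frames ahead. It assumes a commonly used minimal-GRU form (one update gate, ReLU candidate state, no reset gate) and stands in for the temporal convolution with one scalar weight per future frame, which is a simplification. All function and parameter names (`run_layer`, `future_context`, `P`, `Wz`, `Uz`, `Wh`, `Uh`, `taps`) are hypothetical, and details such as batch normalization are omitted.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def future_context(frames, t, taps):
    """Weighted sum of frames t, t+1, ..., t+len(taps)-1.

    Assumption: one scalar weight per frame; indices past the end of the
    utterance are clamped to the last frame.
    """
    last = len(frames) - 1
    out = [0.0] * len(frames[0])
    for k, weight in enumerate(taps):
        frame = frames[min(t + k, last)]
        for i, value in enumerate(frame):
            out[i] += weight * value
    return out


def run_layer(inputs, P, Wz, Uz, Wh, Uh, taps):
    """Run one single-gate recurrent layer over a list of input frames.

    P projects each input frame to a smaller dimension; Wz/Uz drive the
    update gate and Wh/Uh the ReLU candidate state. The layer looks
    len(taps) - 1 frames into the future.
    """
    projected = [matvec(P, x) for x in inputs]  # input projection
    h = [0.0] * len(Uz)                         # initial hidden state
    outputs = []
    for t in range(len(inputs)):
        v = future_context(projected, t, taps)
        z = [sigmoid(a + b) for a, b in zip(matvec(Wz, v), matvec(Uz, h))]
        cand = [max(0.0, a + b) for a, b in zip(matvec(Wh, v), matvec(Uh, h))]
        # Update gate interpolates between the old state and the candidate.
        h = [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h, cand)]
        outputs.append(h)
    return outputs
```

In this sketch the added latency of a layer is `len(taps) - 1` frames, and stacking layers adds their lookaheads together, which is why the amount of future context per layer has to be kept small for online decoding.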
Related papers
- Future Word Contexts In Neural Network Language Models (2017)
- Light Gated Recurrent Units For Speech Recognition (2018)
- Frame Stacking And Retaining For Recurrent Neural Network Acoustic Model (2017)
- Memory Visualization For Gated Recurrent Neural Networks In Speech Recognition (2016)
- Improving RNN-T ASR Accuracy Using Context Audio (2020)
- High Order Recurrent Neural Networks For Acoustic Modelling (2018)
- Twin Regularization For Online Speech Recognition (2018)
- Improving Speech Recognition By Revising Gated Recurrent Units (2017)