On The Relation Between Internal Language Model And Sequence Discriminative Training For Neural Transducers
2023 · Zijian Yang, Wei Zhou, Ralf Schlüter, et al.
Abstract
Internal language model (ILM) subtraction has been widely applied to improve the performance of the RNN-Transducer with external language model (LM) fusion for speech recognition. In this work, we show that sequence discriminative training is strongly correlated with ILM subtraction from both theoretical and empirical points of view. Theoretically, we derive that the global optimum of maximum mutual information (MMI) training takes a form similar to that of ILM subtraction. Empirically, we show that ILM subtraction and sequence discriminative training achieve similar effects across a wide range of experiments on Librispeech, including both MMI and minimum Bayes risk (MBR) criteria, as well as neural transducers and LMs of both full and limited context. The benefit of ILM subtraction also becomes much smaller after sequence discriminative training. We also provide an in-depth study to show that sequence discriminative training has a minimal effect on the commonly used zero-encoder ILM est
Related papers
- On Language Model Integration For RNN Transducer Based Speech Recognition (2021)
- Internal Language Model Training For Domain-adaptive End-to-end Speech Recognition (2021)
- A Comparison Of Lattice-free Discriminative Training Criteria For Purely Sequence-trained Neural Network Acoustic Models (2018)
- An Empirical Study Of Language Model Integration For Transducer Based Speech Recognition (2022)
- Internal Language Model Estimation For Domain-adaptive End-to-end Speech Recognition (2020)
- Improved Neural Language Model Fusion For Streaming Recurrent Neural Network Transducer (2020)
- Internal Language Model Estimation Through Explicit Context Vector Learning For Attention-based Encoder-decoder ASR (2022)
- An Analysis Of Incorporating An External Language Model Into A Sequence-to-sequence Model (2017)