Internal Language Model Estimation Through Explicit Context Vector Learning For Attention-based Encoder-decoder ASR
2022 · Yufei Liu, Rao Ma, Haihua Xu, et al.
Abstract
An end-to-end (E2E) ASR model implicitly learns a prior Internal Language Model (ILM) from the training transcripts. To fuse an external LM using Bayes posterior theory, the log likelihood produced by the ILM has to be accurately estimated and subtracted. In this paper, we propose two novel approaches to estimate the ILM based on the Listen-Attend-Spell (LAS) framework. The first method replaces the context vector of the LAS decoder at every time step with a vector that is learned from the training transcripts. The second method uses a lightweight feed-forward network to map the query vector directly to the context vector, so the context changes dynamically with the decoder state. Because the context vectors are learned by minimizing perplexity on the training transcripts, and their estimation is independent of the encoder output, the ILMs are accurately learned by both methods. Experiments show that the ILMs achieve the lowest perplexity, indicating the efficacy of the proposed methods. In addition, they al
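The ideas in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It shows (a) a fusion score in which the estimated ILM log likelihood is subtracted, and (b) the two ways of producing a context vector without looking at the encoder output. The function names, the interpolation weights `lam_ext` and `lam_ilm`, the single hidden layer, and the ReLU activation are all assumptions made for illustration; the abstract does not specify them.

```python
def ilme_fusion_score(log_p_e2e, log_p_ext, log_p_ilm, lam_ext, lam_ilm):
    """Hypothesis score: E2E log likelihood plus weighted external LM,
    minus weighted ILM estimate. The two weights are hypothetical tunables."""
    return log_p_e2e + lam_ext * log_p_ext - lam_ilm * log_p_ilm


def static_context(learned_vector):
    """Method 1 (sketch): one learned vector stands in for the attention
    context at every decoder step, regardless of the query."""
    def context_fn(query):
        return list(learned_vector)  # query and encoder output are ignored
    return context_fn


def feed_forward_context(w1, b1, w2, b2):
    """Method 2 (sketch): a small feed-forward net maps the decoder query
    to a context vector. One hidden layer with ReLU is an assumption."""
    def affine(x, w, b):
        # w is a list of rows, one row of input weights per output unit
        return [sum(wi * xi for wi, xi in zip(row, x)) + bi
                for row, bi in zip(w, b)]

    def context_fn(query):
        hidden = [max(0.0, h) for h in affine(query, w1, b1)]
        return affine(hidden, w2, b2)
    return context_fn


# Usage with toy numbers (all values are made up for illustration).
score = ilme_fusion_score(-10.0, -20.0, -15.0, lam_ext=0.5, lam_ilm=0.4)
ctx_static = static_context([0.1, -0.2])
ctx_dynamic = feed_forward_context(
    w1=[[1.0, 0.0], [0.0, -1.0]], b1=[0.0, 0.0],
    w2=[[1.0, 1.0]], b2=[0.5],
)
print(score, ctx_static([1.0, 2.0]), ctx_dynamic([2.0, 3.0]))
```

In both cases the context depends only on parameters that would be trained on transcripts (and, for the second variant, on the decoder query), which is what makes the resulting decoder behave like a language model.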
Related papers
- Investigating Methods To Improve Language Model Integration For Attention-based Encoder-decoder ASR Models (2021) · 0.00
- Internal Language Model Estimation For Domain-adaptive End-to-end Speech Recognition (2020) · 13.44
- Internal Language Model Training For Domain-adaptive End-to-end Speech Recognition (2021) · 11.39
- Internal Language Model Estimation Based Adaptive Language Model Fusion For Domain Adaptation (2022) · 0.00
- Independent Language Modeling Architecture For End-to-end ASR (2019) · 0.00
- Attention-based Contextual Language Model Adaptation For Speech Recognition (2021) · 0.00
- Mask The Bias: Improving Domain-adaptive Generalization Of CTC-based ASR With Internal Language Model Estimation (2023) · 3.58
- Label-context-dependent Internal Language Model Estimation For CTC (2025) · 0.00