Lattice Rescoring Strategies For Long Short Term Memory Language Models In Speech Recognition
2017 · Shankar Kumar, Michael Nirschl, Daniel Holtmann-Rice, et al.
Abstract
Recurrent neural network (RNN) language models (LMs) and Long Short Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional N-gram LMs on speech recognition tasks. However, these models are computationally more expensive than N-gram LMs for decoding, and thus, challenging to integrate into speech recognizers. Recent research has proposed the use of lattice-rescoring algorithms using RNNLMs and LSTMLMs as an efficient strategy to integrate these models into a speech recognition system. In this paper, we evaluate existing lattice rescoring algorithms along with new variants on a YouTube speech recognition task. Lattice rescoring using LSTMLMs reduces the word error rate (WER) for this task by 8% relative to the WER obtained using an N-gram LM.
Related papers
- Context-aware RNNLM Rescoring For Conversational Speech Recognition (2020)
- Discriminative Speech Recognition Rescoring With Pre-trained Language Models (2023)
- Contextualizing ASR Lattice Rescoring With Hybrid Pointer Network Language Model (2020)
- Speech Recognition With LLMs Adapted To Disordered Speech Using Reinforcement Learning (2024)
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition (2023)
- Comparison Of Lattice-free And Lattice-based Sequence Discriminative Training Criteria For LVCSR (2019)
- Future Word Contexts In Neural Network Language Models (2017)
- Multi-stage Large Language Model Correction For Speech Recognition (2023)