Learn Spelling From Teachers: Transferring Knowledge From Language Models To Sequence-to-sequence Speech Recognition
2019 · Ye Bai, Jiangyan Yi, Jianhua Tao, et al.
Abstract
Integrating an external language model into a sequence-to-sequence speech recognition system is non-trivial. Previous works use linear interpolation or a fusion network to integrate external language models. However, these approaches introduce external components and increase decoding computation. In this paper, we instead propose a knowledge-distillation-based training approach to integrating external language models into a sequence-to-sequence model. A recurrent neural network language model, trained on large-scale external text, generates soft labels to guide the training of the sequence-to-sequence model. Thus, the language model plays the role of the teacher. This approach does not add any external component to the sequence-to-sequence model during testing. It is also flexible enough to be combined with the shallow fusion technique for decoding. The experiments are conducted on the public Chinese datasets AISHELL-1 and CLMAD. Our approach achieves a character error rate
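The training idea described in the abstract, a language model supplying soft labels to the sequence-to-sequence model, can be sketched for a single output token as follows. This is a minimal illustration, not the authors' implementation: the interpolation weight `lam`, the `temperature` applied to the teacher, and the function names are assumptions, since the abstract does not give the exact form of the loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, target_index,
                      lam=0.5, temperature=1.0):
    """Per-token loss mixing hard-label and soft-label cross-entropy.

    student_logits: scores from the sequence-to-sequence decoder
    teacher_logits: scores from the external language model (the teacher)
    target_index:   index of the ground-truth character
    lam:            assumed weight on the soft-label term
    temperature:    assumed smoothing of the teacher distribution only
    """
    student_probs = softmax(student_logits)
    teacher_probs = softmax(teacher_logits, temperature)

    # Standard cross-entropy against the one-hot transcript label.
    hard = -math.log(student_probs[target_index])

    # Cross-entropy against the teacher's soft labels.
    soft = -sum(t * math.log(s)
                for t, s in zip(teacher_probs, student_probs) if t > 0)

    return (1.0 - lam) * hard + lam * soft


if __name__ == "__main__":
    # Toy three-character vocabulary; all numbers are made up.
    print(distillation_loss([1.0, 2.0, 3.0], [0.5, 2.5, 1.0], target_index=2))
```

In this sketch the teacher is needed only to compute the training loss, which is consistent with the abstract's point that no external component is added to the model at test time.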
Related papers
- An Analysis Of Incorporating An External Language Model Into A Sequence-to-sequence Model (2017) · 16.25
- Language Model Integration Based On Memory Control For Sequence To Sequence Speech Recognition (2018) · 2.26
- End-to-end Speech Translation With Knowledge Distillation (2019) · 0.00
- Knowledge Distillation From Language Model To Acoustic Model: A Hierarchical Multi-task Learning Approach (2021) · 3.58
- Knowledge Transfer From Large-scale Pretrained Language Models To End-to-end Speech Recognizers (2022) · 9.41
- On Language Model Integration For RNN Transducer Based Speech Recognition (2021) · 9.59
- Transfer Learning Of Language-independent End-to-end ASR With Language Model Fusion (2018) · 0.00
- Towards Better Decoding And Language Model Integration In Sequence To Sequence Models (2016) · 15.67