Label-context-dependent Internal Language Model Estimation For CTC
2025 · Zijian Yang, Minh-Nghia Phan, Ralf Schlüter, et al.
Abstract
Although connectionist temporal classification (CTC) has the label context independence assumption, it can still implicitly learn a context-dependent internal language model (ILM) due to modern powerful encoders. In this work, we investigate the implicit context dependency modeled in the ILM of CTC. To this end, we propose novel context-dependent ILM estimation methods for CTC based on knowledge distillation (KD) with theoretical justifications. Furthermore, we introduce two regularization methods for KD. We conduct experiments on Librispeech and TED-LIUM Release 2 datasets for in-domain and cross-domain evaluation, respectively. Experimental results show that context-dependent ILMs outperform the context-independent priors in cross-domain evaluation, indicating that CTC learns a context-dependent ILM. The proposed label-level KD with smoothing method surpasses other ILM estimation approaches, with more than 13% relative improvement in word error rate compared to shallow fusion.
Authors
(none)
Tags
Stats
Related papers
- Inter-kd: Intermediate Knowledge Distillation For Ctc-based Automatic Speech Recognition (2022)7.50
- Internal Language Model Estimation Through Explicit Context Vector Learning For Attention-based Encoder-decoder ASR (2022)7.50
- Multilingual Training And Cross-lingual Adaptation On Ctc-based Acoustic Model (2017)0.00
- Mask The Bias: Improving Domain-adaptive Generalization Of Ctc-based ASR With Internal Language Model Estimation (2023)3.58
- Efficient CTC Regularization Via Coarse Labels For End-to-end Speech Translation (2023)0.00
- BERT Meets CTC: New Formulation Of End-to-end Speech Recognition With Pre-trained Masked Language Model (2022)0.00
- Exploring In-context Learning Of Textless Speech Language Model For Speech Classification Tasks (2023)3.58
- CR-CTC: Consistency Regularization On CTC For Improved Speech Recognition (2024)6.30