LI-TTA: Language Informed Test-time Adaptation For Automatic Speech Recognition
2024 · Eunseop Yoon, Hee Suk Yoon, John Harvill, et al.
Abstract
Test-Time Adaptation (TTA) has emerged as a crucial solution to the domain shift challenge, wherein the target environment diverges from the original training environment. A prime example is TTA for Automatic Speech Recognition (ASR), which enhances model performance by using minimization of the output prediction entropy as a self-supervision signal. However, a key limitation of this self-supervision is that it focuses primarily on acoustic features and pays minimal attention to the linguistic properties of the input. To address this gap, we propose Language Informed Test-Time Adaptation (LI-TTA), which incorporates linguistic insights during TTA for ASR. LI-TTA integrates corrections from an external language model to merge linguistic with acoustic information by minimizing the CTC loss from the correction alongside the standard TTA loss. Through extensive experiments, we show that LI-TTA effectively improves the performance of TTA for ASR in various distribution shift situations.
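The objective described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the `weight` parameter, the simple additive combination, and the use of plain per-frame entropy as the "standard TTA loss" are all assumptions made for illustration. The token sequence passed as `corrected_tokens` stands in for the output of the external language model, which is not modeled here.

```python
import math

def entropy_loss(probs):
    """Mean per-frame Shannon entropy of the model's output distributions.

    `probs` is a list of frames, each a list of class probabilities.
    """
    total = 0.0
    for frame in probs:
        total -= sum(p * math.log(p) for p in frame if p > 0.0)
    return total / len(probs)

def ctc_nll(probs, target, blank=0):
    """CTC negative log-likelihood of `target`, via the forward algorithm.

    Works on raw probabilities for readability; a real implementation
    would work in log space to avoid underflow on long utterances.
    """
    # Interleave blanks: [blank, t1, blank, t2, ..., blank]
    ext = [blank]
    for tok in target:
        ext += [tok, blank]
    size = len(ext)

    # Initialisation: a path may start on the first blank or the first token.
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for frame in probs[1:]:
        new = [0.0] * size
        for s in range(size):
            a = alpha[s]                      # stay on the same symbol
            if s >= 1:
                a += alpha[s - 1]             # advance by one position
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[s - 2]             # skip the blank between distinct tokens
            new[s] = a * frame[ext[s]]
        alpha = new

    # Valid paths end on the last token or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0.0 else math.inf

def li_tta_loss(probs, corrected_tokens, weight=1.0):
    """Hypothetical combined objective: entropy term plus weighted CTC term.

    `weight` is an assumed hyperparameter, not a value taken from the paper.
    """
    return entropy_loss(probs) + weight * ctc_nll(probs, corrected_tokens)

# Toy usage: two frames, two classes (0 = blank, 1 = a single token).
frames = [[0.5, 0.5], [0.5, 0.5]]
loss = li_tta_loss(frames, corrected_tokens=[1])
```

In an actual adaptation loop the two terms would be computed on differentiable model outputs and minimized by gradient descent for each test utterance; the sketch only shows how the acoustic (entropy) and linguistic (CTC against the correction) signals enter a single scalar loss.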
Related papers
- SLM-TTA: A Framework For Test-time Adaptation Of Generative Spoken Language Models (2025)
- Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation For Automatic Speech Recognition (2022)
- Examining Test-time Adaptation For Personalized Child Speech Recognition (2024)
- SUTA-LM: Bridging Test-time Adaptation And Language Model Rescoring For Robust ASR (2025)
- Continual Test-time Adaptation For End-to-end Speech Recognition On Noisy Speech (2024)
- Advancing Test-time Adaptation In Wild Acoustic Test Settings (2023)
- EMO-TTA: Improving Test-time Adaptation Of Audio-language Models For Speech Emotion Recognition (2025)
- Multiple Consistency-guided Test-time Adaptation For Contrastive Audio-language Models With Unlabeled Audio (2024)