E-BATS: Efficient Backpropagation-free Test-time Adaptation For Speech Foundation Models
2025 Β· Jiaheng Dong, Hong Jia, Soumyajit Chatterjee, et al.
Abstract
Speech Foundation Models encounter significant performance degradation when deployed in real-world scenarios involving acoustic domain shifts, such as background noise and speaker accents. Test-time adaptation (TTA) has recently emerged as a viable strategy to address such domain shifts at inference time without requiring access to source data or labels. However, existing TTA approaches, particularly those relying on backpropagation, are memory-intensive, limiting their applicability in speech tasks and resource-constrained settings. Although backpropagation-free methods offer improved efficiency, existing ones exhibit poor accuracy. This is because they are predominantly developed for vision tasks, which fundamentally differ from speech task formulations, noise characteristics, and model architecture, posing unique transferability challenges. In this paper, we introduce E-BATS, the first Efficient BAckpropagation-free TTA framework designed explicitly for speech foundation models. E-B
Authors
(none)
Tags
Stats
Related papers
- Advancing Test-time Adaptation In Wild Acoustic Test Settings (2023)2.26
- SLM-TTA: A Framework For Test-time Adaptation Of Generative Spoken Language Models (2025)0.00
- Examining Test-time Adaptation For Personalized Child Speech Recognition (2024)0.00
- EMO-TTA: Improving Test-time Adaptation Of Audio-language Models For Speech Emotion Recognition (2025)0.00
- LI-TTA: Language Informed Test-time Adaptation For Automatic Speech Recognition (2024)3.58
- Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation For Automatic Speech Recognition (2022)8.09
- Continual Test-time Adaptation For End-to-end Speech Recognition On Noisy Speech (2024)4.52
- Multiple Consistency-guided Test-time Adaptation For Contrastive Audio-language Models With Unlabeled Audio (2024)2.26