Continual Learning In Machine Speech Chain Using Gradient Episodic Memory
2024 · Geoffrey Tyndall, Kurniawati Azizah, Dipta Tanaya, et al.
Abstract
Continual learning for automatic speech recognition (ASR) systems is challenging because the model must avoid catastrophic forgetting, retaining its performance on previously learned tasks as it learns new ones. This paper introduces a novel approach that leverages the machine speech chain framework to enable continual learning in ASR using gradient episodic memory (GEM). By incorporating a text-to-speech (TTS) component within the machine speech chain, we support the replay mechanism essential for GEM, allowing the ASR model to learn new tasks sequentially without significant performance degradation on earlier tasks. Our experiments, conducted on the LJ Speech dataset, demonstrate that our method outperforms traditional fine-tuning and multitask learning approaches, achieving a substantial error rate reduction while maintaining high performance across varying noise conditions. We show the potential of our semi-supervised machine speech chain approach for effective and efficient continual learning.
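To make the GEM constraint mentioned in the abstract concrete, the sketch below shows the gradient check for the simplest case of a single earlier task. It is an illustration, not the authors' implementation: the function name `project_gradient` and the flat-list gradient representation are assumptions made here. It assumes `g` is the gradient on the new task and `g_ref` is the gradient computed on replayed samples from the earlier task (in the paper's setting, the replay is supported by the TTS component). With one constraint, GEM's quadratic program reduces to the closed-form projection used here; with several earlier tasks, a QP solver would be needed instead.

```python
def project_gradient(g, g_ref):
    """Single-constraint GEM step (illustrative sketch).

    g     : gradient of the loss on the current task (flat list of floats)
    g_ref : gradient of the loss on replayed samples from one earlier task

    If the two gradients do not conflict (dot product >= 0), g is kept.
    Otherwise g is projected onto the closest vector whose dot product
    with g_ref is zero, so the update no longer increases the earlier
    task's loss to first order.
    """
    dot = sum(a * b for a, b in zip(g, g_ref))
    if dot >= 0:
        return list(g)  # no interference: use the gradient unchanged
    ref_sq = sum(b * b for b in g_ref)  # > 0 whenever dot < 0
    scale = dot / ref_sq
    return [a - scale * b for a, b in zip(g, g_ref)]


# Non-conflicting gradients are left alone.
aligned = project_gradient([1.0, 1.0], [1.0, 0.0])

# A conflicting gradient loses its component along g_ref.
conflicting = project_gradient([-1.0, 1.0], [1.0, 0.0])
```

In a training loop, the projected gradient would replace the raw gradient before the optimizer step; how the replayed samples are generated and stored is outside the scope of this sketch.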
Related papers
- SGEM: Test-time Adaptation For Automatic Speech Recognition Via Sequential-level Generalized Entropy Minimization (2023)
- Continual Learning For Monolingual End-to-end Automatic Speech Recognition (2021)
- Continuously Learning New Words In Automatic Speech Recognition (2024)
- Tokenchain: A Discrete Speech Chain Via Semantic Token Modeling (2025)
- G2G: TTS-driven Pronunciation Learning For Graphemic Hybrid ASR (2019)
- Rehearsal-free Online Continual Learning For Automatic Speech Recognition (2023)
- Listening While Speaking And Visualizing: Improving ASR Through Multimodal Chain (2019)
- Continual Speaker Adaptation For Text-to-speech Synthesis (2021)