Improved End-to-end Dysarthric Speech Recognition Via Meta-learning Based Model Re-initialization
2020 · Disong Wang, Jianwei Yu, Xixin Wu, et al.
Abstract
Dysarthric speech recognition is a challenging task because dysarthric data is limited and its acoustics deviate significantly from normal speech. Model-based speaker adaptation is a promising approach: it uses the limited dysarthric speech to fine-tune a base model, pre-trained on large amounts of normal speech, into speaker-dependent models. However, statistical distribution mismatches between normal and dysarthric speech data limit the adaptation performance of the base model. To address this problem, we propose to re-initialize the base model via meta-learning to obtain a better model initialization. Specifically, we focus on end-to-end models and extend the model-agnostic meta-learning (MAML) and Reptile algorithms to meta-update the base model by repeatedly simulating adaptation to different dysarthric speakers. As a result, the re-initialized model acquires dysarthric speech knowledge and learns how to perform fast adaptation to unseen dysarthric speakers with impr
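The re-initialization idea in the abstract can be illustrated with a minimal Reptile-style sketch. This is not the authors' implementation: a two-parameter linear regressor stands in for the end-to-end recognizer, each "speaker" is a hypothetical toy regression task, and all function names and hyperparameters (`sgd_adapt`, `reptile`, `meta_lr`, and so on) are assumptions made for illustration. Only the outer update, which moves the initialization a fraction of the way toward the speaker-adapted weights, is the standard Reptile rule.

```python
import random

def sgd_adapt(theta, data, lr=0.05, steps=5):
    """Inner loop: a few gradient steps on one speaker's data (squared error)."""
    w, b = theta
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / len(data)
            grad_b += 2 * err / len(data)
        w, b = w - lr * grad_w, b - lr * grad_b
    return (w, b)

def reptile(theta, speakers, meta_lr=0.5, iterations=200, seed=0):
    """Outer loop: repeatedly simulate adaptation to a sampled speaker, then
    move the initialization toward the adapted weights (Reptile update)."""
    rng = random.Random(seed)
    for _ in range(iterations):
        data = rng.choice(speakers)
        phi = sgd_adapt(theta, data)  # speaker-adapted weights
        theta = tuple(t + meta_lr * (p - t) for t, p in zip(theta, phi))
    return theta

def make_speaker(slope, offset):
    """Hypothetical toy 'speaker': samples from y = slope * x + offset."""
    xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
    return [(x, slope * x + offset) for x in xs]

def mse(theta, data):
    w, b = theta
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

# Training speakers for the meta-update (toy stand-ins, not real data).
speakers = [make_speaker(2.0, 1.0), make_speaker(2.5, 0.5), make_speaker(1.5, 1.5)]

base = (0.0, 0.0)                    # stands in for the normal-speech base model
meta_init = reptile(base, speakers)  # re-initialized model

# Adapt both initializations to an unseen speaker with the same small budget.
unseen = make_speaker(2.2, 0.8)
loss_base = mse(sgd_adapt(base, unseen), unseen)
loss_meta = mse(sgd_adapt(meta_init, unseen), unseen)
print(loss_meta < loss_base)
```

In this toy setting the re-initialized weights start closer to the family of speaker tasks, so the same five adaptation steps leave a lower error on the unseen speaker than adapting from the original base weights. MAML differs in that its outer update differentiates through the inner adaptation steps instead of interpolating toward the adapted weights.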
Related papers
- Learning To Adapt: A Meta-learning Approach For Speaker Adaptation (2018)
- The Universal Personalizer: Few-shot Dysarthric Speech Recognition Via Meta-learning (2025)
- Meta Learning For End-to-end Low-resource Speech Recognition (2019)
- Enhancing Dysarthric Speech Recognition For Unseen Speakers Via Prototype-based Adaptation (2024)
- Meta-TTS: Meta-learning For Few-shot Speaker Adaptive Text-to-speech (2021)
- Data Efficient Direct Speech-to-text Translation With Modality Agnostic Meta-learning (2019)
- Optimal Transport-based Adaptation In Dysarthric Speech Tasks (2021)
- Speaker Identity Preservation In Dysarthric Speech Reconstruction By Adversarial Speaker Adaptation (2022)