Adapt-and-adjust: Overcoming The Long-tail Problem Of Multilingual Speech Recognition
2020 · Genta Indra Winata, Guangsen Wang, Caiming Xiong, et al.
Abstract
One crucial challenge of real-world multilingual speech recognition is the long-tailed distribution problem, where some resource-rich languages like English have abundant training data, but a long tail of low-resource languages have varying amounts of limited training data. To overcome the long-tail problem, in this paper, we propose Adapt-and-Adjust (A2), a transformer-based multi-task learning framework for end-to-end multilingual speech recognition. The A2 framework overcomes the long-tail problem via three techniques: (1) exploiting a pretrained multilingual language model (mBERT) to improve the performance of low-resource languages; (2) proposing dual adapters consisting of both language-specific and language-agnostic adaptation with minimal additional parameters; and (3) overcoming the class imbalance, either by imposing class priors in the loss during training or adjusting the logits of the softmax output during inference. Extensive experiments on the CommonVoice corpus show tha
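Technique (3) in the abstract can be illustrated with a small sketch. The abstract does not give a formula, so the form used here is an assumption: each logit is shifted by a scaled log class prior, subtracted at inference or added inside the training loss. The function names, the `tau` scaling parameter, and the toy class counts are all hypothetical and chosen only for illustration.

```python
import math

def class_priors(counts):
    """Estimate class priors from training-set label counts."""
    total = sum(counts)
    return [c / total for c in counts]

def adjust_logits(logits, priors, tau=1.0):
    """Inference-time adjustment (assumed form): subtract tau * log(prior),
    which boosts rare classes relative to frequent ones."""
    return [z - tau * math.log(p) for z, p in zip(logits, priors)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def prior_adjusted_cross_entropy(logits, target, priors, tau=1.0):
    """Training-time variant (assumed form): add tau * log(prior) to the
    logits before the softmax cross-entropy."""
    shifted = [z + tau * math.log(p) for z, p in zip(logits, priors)]
    return -math.log(softmax(shifted)[target])

# Toy long-tailed label counts: one frequent class and two rare ones.
counts = [900, 90, 10]
priors = class_priors(counts)

# Raw logits that favour the frequent class.
logits = [2.0, 1.5, 1.0]
raw_pred = max(range(len(logits)), key=lambda i: logits[i])

# After adjustment the rarest class wins, because its log prior is most negative.
adjusted = adjust_logits(logits, priors)
adj_pred = max(range(len(adjusted)), key=lambda i: adjusted[i])

loss = prior_adjusted_cross_entropy(logits, target=2, priors=priors)
```

With these toy numbers the unadjusted prediction is the frequent class and the adjusted prediction is the rarest class. Setting `tau` to 0 disables the adjustment, so in this sketch `tau` controls how strongly the class prior is corrected.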
Related papers
- Massively Multilingual Adversarial Speech Recognition (2019)
- Multi-staged Cross-lingual Acoustic Model Adaption For Robust Speech Recognition In Real-world Applications -- A Case Study On German Oral History Interviews (2020)
- Adaptive Activation Network For Low Resource Multilingual Speech Recognition (2022)
- Continual Learning For Monolingual End-to-end Automatic Speech Recognition (2021)
- Transformer-transducers For Code-switched Speech Recognition (2020)
- Multilingual End-to-end Speech Recognition With A Single Transformer On Low-resource Languages (2018)
- Leveraging Parameter-efficient Transfer Learning For Multi-lingual Text-to-speech Adaptation (2024)
- ADAPTERMIX: Exploring The Efficacy Of Mixture Of Adapters For Low-resource TTS Adaptation (2023)