An Effective Mixture-of-experts Approach For Code-switching Speech Recognition Leveraging Encoder Disentanglement
2024 · Tzu-Ting Yang, Hsin-Wei Wang, Yi-Cheng Wang, et al.
Abstract
With the rapid development of end-to-end (E2E) neural networks, recent years have witnessed unprecedented breakthroughs in automatic speech recognition (ASR). However, the code-switching phenomenon remains a major obstacle for ASR, as the lack of labeled data and the variations between languages often degrade recognition performance. In this paper, we focus exclusively on improving the acoustic encoder of E2E ASR to tackle the challenges posed by code-switching. Our main contributions are threefold: First, we introduce a novel disentanglement loss that enables the lower layer of the encoder to capture inter-lingual acoustic information while mitigating linguistic confusion at the higher layer of the encoder. Second, through comprehensive experiments, we verify that our proposed method outperforms prior-art methods that use pretrained dual encoders, while having access only to the code-switching corpus and consuming half of the parameteri
Related papers
- Constrained Output Embeddings For End-to-end Code-switching Speech Recognition With Only Monolingual Data (2019) 7.16
- On The End-to-end Solution To Mandarin-english Code-switching Speech Recognition (2018) 12.10
- Towards End-to-end Code-switching Speech Recognition (2018) 0.00
- Decoupling Pronunciation And Language For End-to-end Code-switching Automatic Speech Recognition (2020) 0.00
- Language-agnostic Code-switching In Sequence-to-sequence Speech Recognition (2022) 0.00
- End-to-end Code-switching ASR For Low-resourced Language Pairs (2019) 9.76
- Integrating Knowledge In End-to-end Automatic Speech Recognition For Mandarin-english Code-switching (2021) 5.24
- Transformer-transducers For Code-switched Speech Recognition (2020) 10.97