Language-routing Mixture Of Experts For Multilingual And Code-switching Speech Recognition
2023 · Wenxuan Wang, Guodong Ma, Yuke Li, et al.
Abstract
Multilingual speech recognition for both monolingual and code-switching speech is a challenging task. Recently, many works based on the Mixture of Experts (MoE) have made good progress in multilingual and code-switching ASR, but their computational complexity grows sharply as the number of supported languages increases. In this work, we propose a computation-efficient network named Language-Routing Mixture of Experts (LR-MoE) for multilingual and code-switching ASR. LR-MoE extracts language-specific representations through the Mixture of Language Experts (MLE), whose learning is guided by a frame-wise language routing mechanism. The weight-shared frame-level language identification (LID) network is jointly trained as the shared pre-router of each MoE layer. Experiments show that the proposed method significantly improves multilingual and code-switching speech recognition performance over the baseline with comparable computational efficiency.
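The routing idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the ReLU feed-forward experts, the residual connection, the hard top-1 (argmax) routing, and all names (`lid_route`, `mle_layer`, `make_ffn`) are assumptions made for illustration. It only shows a single frame-level LID decision being shared across several MoE layers, with each frame passed to one language expert per layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_ffn(d_model, d_hidden):
    # One hypothetical language expert: a two-layer feed-forward block.
    return {
        "w1": rng.standard_normal((d_model, d_hidden)) * 0.1,
        "w2": rng.standard_normal((d_hidden, d_model)) * 0.1,
    }

def ffn(params, x):
    # ReLU feed-forward applied independently to every frame.
    return np.maximum(x @ params["w1"], 0.0) @ params["w2"]

def lid_route(frames, w_lid):
    # Frame-level LID: one language decision per frame (assumed hard argmax).
    logits = frames @ w_lid            # shape (T, n_langs)
    return logits.argmax(axis=-1)      # shape (T,)

def mle_layer(frames, lang_ids, experts):
    # Send each frame to the single expert chosen by the shared router.
    out = np.zeros_like(frames)
    for lang, params in enumerate(experts):
        mask = lang_ids == lang
        if mask.any():
            out[mask] = ffn(params, frames[mask])
    return frames + out                # residual connection (assumption)

# Toy sizes, chosen arbitrarily for the sketch.
T, d_model, d_hidden, n_langs, n_layers = 6, 8, 16, 2, 3
frames = rng.standard_normal((T, d_model))
w_lid = rng.standard_normal((d_model, n_langs))   # shared router weights
layers = [[make_ffn(d_model, d_hidden) for _ in range(n_langs)]
          for _ in range(n_layers)]

# The routing decision is computed once and reused by every MoE layer.
lang_ids = lid_route(frames, w_lid)
hidden = frames
for experts in layers:
    hidden = mle_layer(hidden, lang_ids, experts)
```

In this sketch each frame activates only one expert per layer, so the per-frame compute does not grow with the number of languages; only the parameter count does. That is one way to read the abstract's claim of comparable computational efficiency.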
Related papers
- Building A Great Multi-lingual Teacher With Sparsely-gated Mixture Of Experts For Speech Recognition (2021) · 0.00
- MoLE: Mixture Of Language Experts For Multi-lingual Automatic Speech Recognition (2023) · 9.41
- SpeechMoE: Scaling To Large Acoustic Models With Dynamic Routing Mixture Of Experts (2021) · 10.97
- BA-MoE: Boundary-aware Mixture-of-experts Adapter For Code-switching Speech Recognition (2023) · 7.50
- SC-MoE: Switch Conformer Mixture Of Experts For Unified Streaming And Non-streaming Code-switching ASR (2024) · 6.77
- CAMEL: Cross-attention Enhanced Mixture-of-experts And Language Bias For Code-switching Speech Recognition (2024) · 0.00
- HDMoLE: Mixture Of LoRA Experts With Hierarchical Routing And Dynamic Thresholds For Fine-tuning LLM-based ASR Models (2024) · 8.09
- LAE-ST-MoE: Boosted Language-aware Encoder Using Speech Translation Auxiliary Task For E2E Code-switching ASR (2023) · 6.34