HDMoLE: Mixture Of LoRA Experts With Hierarchical Routing And Dynamic Thresholds For Fine-tuning LLM-based ASR Models
2024 · Bingshen Mu, Kun Wei, Qijie Shao, et al.
Abstract
Recent work integrating large language models (LLMs) with automatic speech recognition (ASR) has achieved remarkable performance in general domains. Supervised fine-tuning (SFT) of all model parameters is often used to adapt pre-trained LLM-based ASR models to specific domains, but it imposes high computational costs and notably reduces performance in general domains. In this paper, we propose HDMoLE, a novel parameter-efficient multi-domain fine-tuning method for adapting pre-trained LLM-based ASR models to multi-accent domains without catastrophic forgetting. HDMoLE combines low-rank adaptation (LoRA) with mixture of experts (MoE), leverages hierarchical routing and dynamic thresholds, and can be generalized to any linear layer. Hierarchical routing establishes a clear correspondence between LoRA experts and accent domains, improving cross-domain collaboration among the LoRA experts. Unlike the static Top-K strategy for activating LoRA experts, dy
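The contrast the abstract draws between static Top-K activation and threshold-based activation can be illustrated with a small sketch. It is not the paper's method: the abstract is cut off before the dynamic threshold is defined, so the rule below (activate every expert whose softmax router weight reaches a fixed threshold) is an assumption, and the function names `top_k_experts`, `threshold_experts`, and `renormalize` are hypothetical.

```python
import math


def softmax(logits):
    """Turn router logits into weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_experts(weights, k):
    """Static Top-K: always activate exactly k experts, whatever the weights look like."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(order[:k])


def threshold_experts(weights, threshold):
    """Assumed threshold rule: activate every expert whose weight is >= threshold.

    If no expert reaches the threshold, fall back to the single best expert so
    that at least one LoRA expert is always active.
    """
    active = [i for i, w in enumerate(weights) if w >= threshold]
    if not active:
        active = [max(range(len(weights)), key=lambda i: weights[i])]
    return active


def renormalize(weights, active):
    """Rescale the weights of the active experts so they sum to 1."""
    total = sum(weights[i] for i in active)
    return {i: weights[i] / total for i in active}


# Two hypothetical router outputs over four LoRA experts.
confident = softmax([4.0, 0.5, 0.2, 0.1])   # one expert clearly dominates
ambiguous = softmax([1.2, 1.1, 1.0, -2.0])  # three experts are nearly tied

for name, w in [("confident", confident), ("ambiguous", ambiguous)]:
    print(name, "top-2:", top_k_experts(w, 2), "threshold 0.2:", threshold_experts(w, 0.2))
```

With Top-K the number of active experts is fixed at k for both inputs, while the threshold rule activates one expert for the confident input and three for the ambiguous one. The threshold here is a plain constant chosen for illustration; how HDMoLE sets or adapts it is not visible in this excerpt.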
Related papers
- Language-routing Mixture Of Experts For Multilingual And Code-switching Speech Recognition (2023) · 9.03
- Effective Text Adaptation For LLM-based ASR Through Soft Prompt Fine-tuning (2024) · 5.84
- Multimodal Large Language Models With Fusion Low Rank Adaptation For Device Directed Speech Detection (2024) · 0.00
- A Comprehensive Solution To Connect Speech Encoder And Large Language Model For ASR (2024) · 0.00
- MOSA: Mixtures Of Simple Adapters Outperform Monolithic Approaches In LLM-based Multilingual ASR (2025) · 0.00
- Speech Recognition With LLMs Adapted To Disordered Speech Using Reinforcement Learning (2024) · 5.24
- Building A Great Multi-lingual Teacher With Sparsely-gated Mixture Of Experts For Speech Recognition (2021) · 0.00
- Multi-stage Large Language Model Correction For Speech Recognition (2023) · 0.00