HDMoLE: Mixture of LoRA Experts with Hierarchical Routing and Dynamic Thresholds for Fine-Tuning LLM-Based ASR Models

Abstract

Recent advances in integrating large language models (LLMs) with automatic speech recognition (ASR) have achieved remarkable performance in general domains. Supervised fine-tuning (SFT) of all model parameters is often used to adapt pre-trained LLM-based ASR models to specific domains, but it imposes high computational costs and notably reduces performance in general domains. In this paper, we propose HDMoLE, a novel parameter-efficient multi-domain fine-tuning method that adapts pre-trained LLM-based ASR models to multi-accent domains without catastrophic forgetting. HDMoLE combines low-rank adaptation (LoRA) with mixture of experts (MoE), leverages hierarchical routing and dynamic thresholds, and can be generalized to any linear layer. Hierarchical routing establishes a clear correspondence between LoRA experts and accent domains, improving cross-domain collaboration among the LoRA experts. Unlike the static Top-K strategy for activating LoRA experts, dy
