Understanding Textual Capability Degradation In Speech LLMs Via Parameter Importance Analysis
2025 · Chao Wang, Rui-Chen Zheng, Yang Ai, et al.
Abstract
The integration of speech into Large Language Models (LLMs) has substantially expanded their capabilities, but often at the cost of weakening their core textual competence. This degradation limits the ability of speech-enabled LLMs to fully exploit their pre-trained text-based knowledge. In this work, we analyze the underlying mechanisms of this issue through a focused study of the widely used encoder-adaptor paradigm. We propose an analytical framework based on parameter importance estimation, which reveals that fine-tuning for speech introduces a textual importance distribution shift: the layer-wise allocation of parameters critical to textual reasoning is disrupted. Building on this insight, we investigate two mitigation strategies, layer-wise learning rate scheduling and Low-Rank Adaptation (LoRA), both of which aim to preserve the original parameter distribution. Experimental results show that both approaches better maintain textual competence than full fine-tuning, while also improving do
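To make the two central ideas of the abstract concrete, the following is a minimal sketch and not the authors' method. It assumes per-layer textual importance scores are already available as plain numbers (the abstract does not say how they are estimated), measures the shift between two layer-wise distributions with total variation distance (an assumed metric), and derives layer-wise learning rates with a hypothetical rule that slows down updates to layers with higher textual importance. The names `distribution_shift`, `layerwise_learning_rates`, and `min_scale` are illustrative only.

```python
from typing import List


def normalize(scores: List[float]) -> List[float]:
    """Turn non-negative per-layer importance scores into a distribution."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    return [s / total for s in scores]


def distribution_shift(before: List[float], after: List[float]) -> float:
    """Total variation distance between two layer-wise importance
    distributions (assumed metric, not taken from the paper)."""
    if len(before) != len(after):
        raise ValueError("both score lists must cover the same layers")
    p, q = normalize(before), normalize(after)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def layerwise_learning_rates(
    importance: List[float], base_lr: float, min_scale: float = 0.1
) -> List[float]:
    """Hypothetical schedule: the most text-important layer trains at
    base_lr * min_scale, and a layer with zero importance at base_lr."""
    peak = max(importance)
    if peak <= 0:
        raise ValueError("at least one layer needs positive importance")
    return [base_lr * (1.0 - (1.0 - min_scale) * (s / peak)) for s in importance]


if __name__ == "__main__":
    # Made-up scores for a three-layer model, purely for illustration.
    text_importance_before = [1.0, 2.0, 4.0]
    text_importance_after = [3.0, 2.0, 2.0]
    print(distribution_shift(text_importance_before, text_importance_after))
    print(layerwise_learning_rates(text_importance_before, base_lr=1e-4))
```

In a real training setup, the resulting per-layer rates would typically be passed to the optimizer as separate parameter groups, one per layer.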
Related papers
- Closing The Gap Between Text And Speech Understanding In LLMs (2025)
- Conversational Speech Reveals Structural Robustness Failures In SpeechLLM Backbones (2025)
- Effective Text Adaptation For LLM-based ASR Through Soft Prompt Fine-tuning (2024)
- Exploring Fine-tuning Of Large Audio Language Models For Spoken Language Understanding Under Limited Speech Data (2025)
- A Comprehensive Solution To Connect Speech Encoder And Large Language Model For ASR (2024)
- Efficient Emotion And Speaker Adaptation In LLM-based TTS Via Characteristic-specific Partial Fine-tuning (2025)
- Investigating Decoder-only Large Language Models For Speech-to-text Translation (2024)
- On Decoder-only Architecture For Speech-to-text And Large Language Model Integration (2023)