Dynamically Slimmable Speech Enhancement Network With Metric-guided Training
2025 · Haixin Zhao, Kaixuan Yang, Nilesh Madhu
Abstract
To further reduce the complexity of lightweight speech enhancement models, we introduce a gating-based Dynamically Slimmable Network (DSN). The DSN comprises static and dynamic components. For architecture-independent applicability, we introduce distinct dynamic structures targeting the commonly used components, namely, grouped recurrent neural network units, multi-head attention, convolutional, and fully connected layers. A policy module adaptively governs the use of dynamic parts at a frame-wise resolution according to the input signal quality, controlling computational load. We further propose Metric-Guided Training (MGT) to explicitly guide the policy module in assessing input speech quality. Experimental results demonstrate that the DSN achieves comparable enhancement performance in instrumental metrics to the state-of-the-art lightweight baseline, while using only 73% of its computational load on average. Evaluations of dynamic component usage ratios indicate that the MGT-DSN can
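The frame-wise gating idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the threshold-based policy, the direction of the gate (low estimated quality switches the dynamic part on), and the cost figures are all assumptions made for illustration. It only shows how a per-frame binary gate can skip the dynamic units of a layer and how the average computational load follows from the gate usage.

```python
def policy_gate(frame_quality, threshold=0.5):
    """Hypothetical policy: enable the dynamic part (1) when the
    estimated input quality of a frame falls below a threshold."""
    return 1 if frame_quality < threshold else 0


def slimmable_linear(x, w_static, w_dynamic, gate):
    """Toy fully connected layer with static and dynamic units.

    Static units are always computed. Dynamic units are computed only
    when gate == 1; otherwise their outputs are set to zero and the
    corresponding multiply-accumulates are skipped.
    """
    out = [sum(w * v for w, v in zip(row, x)) for row in w_static]
    if gate:
        out += [sum(w * v for w, v in zip(row, x)) for row in w_dynamic]
    else:
        out += [0.0] * len(w_dynamic)
    return out


def average_load(gates, static_cost, dynamic_cost):
    """Average compute over all frames, relative to always running
    both the static and the dynamic part."""
    full = static_cost + dynamic_cost
    used = sum(static_cost + g * dynamic_cost for g in gates)
    return used / (len(gates) * full)


# Assumed per-frame quality estimates in [0, 1] (illustrative values).
qualities = [0.2, 0.9, 0.7, 0.1]
gates = [policy_gate(q) for q in qualities]

# Assumed cost split: 60 units static, 40 units dynamic per frame.
load = average_load(gates, static_cost=60, dynamic_cost=40)
```

With these assumed numbers, the dynamic part runs on two of the four frames, so the relative load is 0.8. The 73% figure reported in the abstract is an average of this kind, measured against the baseline model rather than against this toy cost split.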
Related papers
- Dynamic Gated Recurrent Neural Network For Compute-efficient Speech Enhancement (2024)
- Dense-TSNet: Dense Connected Two-stage Structure For Ultra-lightweight Speech Enhancement (2024)
- Unsupervised Speech Enhancement With Deep Dynamical Generative Speech And Noise Models (2023)
- Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired By Dynamic Neural Network (2024)
- Lightweight Speech Enhancement Guided Target Speech Extraction In Noisy Multi-speaker Scenarios (2025)
- Multi-CMGAN+/+: Leveraging Multi-objective Speech Quality Metric Prediction For Speech Enhancement (2023)
- EDNet: A Versatile Speech Enhancement Framework With Gating Mamba Mechanism And Phase Shift-invariant Training (2025)
- Incorporating Multi-target In Multi-stage Speech Enhancement Model For Better Generalization (2021)