LiSenNet: Lightweight Sub-band and Dual-path Modeling for Real-time Speech Enhancement
2024 · Haoyin Yan, Jie Zhang, Cunhang Fan, et al.
Abstract
Speech enhancement (SE) aims to extract the clean waveform from noise-contaminated measurements to improve the speech quality and intelligibility. Although learning-based methods can perform much better than traditional counterparts, the large computational complexity and model size heavily limit the deployment on latency-sensitive and low-resource edge devices. In this work, we propose a lightweight SE network (LiSenNet) for real-time applications. We design sub-band downsampling and upsampling blocks and a dual-path recurrent module to capture band-aware features and time-frequency patterns, respectively. A noise detector is developed to detect noisy regions in order to perform SE adaptively and save computational costs. Compared to recent higher-resource-dependent baseline models, the proposed LiSenNet can achieve a competitive performance with only 37k parameters (half of the state-of-the-art model) and 56M multiply-accumulate (MAC) operations per second.
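To put the abstract's headline figures in context, the following is a rough back-of-the-envelope sketch, not the authors' code. The 37k parameter count and 56M MAC/s come from the abstract; the 16 kHz sample rate, 256-sample hop, and float32 weights are assumptions chosen only for illustration, and `gated_macs_per_second` is a hypothetical linear cost model for the idea of skipping enhancement on frames the noise detector does not flag.

```python
# Figures stated in the abstract.
PARAMS = 37_000           # "only 37k parameters"
MACS_PER_SECOND = 56e6    # "56M multiply-accumulate (MAC) operations per second"

# ASSUMPTIONS (not stated in the abstract): 16 kHz audio, 256-sample hop,
# 32-bit floating-point weights.
SAMPLE_RATE_HZ = 16_000
HOP_SAMPLES = 256
BYTES_PER_PARAM = 4


def frames_per_second(sample_rate_hz, hop_samples):
    """Number of analysis frames the model must process per second."""
    return sample_rate_hz / hop_samples


def macs_per_frame(macs_per_second, sample_rate_hz, hop_samples):
    """Per-frame MAC budget implied by a per-second MAC figure."""
    return macs_per_second / frames_per_second(sample_rate_hz, hop_samples)


def model_size_kib(params, bytes_per_param):
    """Storage needed for the weights alone, in KiB."""
    return params * bytes_per_param / 1024


def gated_macs_per_second(macs_per_second, active_fraction,
                          detector_macs_per_second=0.0):
    """Hypothetical average cost when enhancement runs only on the fraction
    of frames flagged as noisy, plus an always-on detector cost."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be in [0, 1]")
    return detector_macs_per_second + active_fraction * macs_per_second


if __name__ == "__main__":
    print(frames_per_second(SAMPLE_RATE_HZ, HOP_SAMPLES))
    print(macs_per_frame(MACS_PER_SECOND, SAMPLE_RATE_HZ, HOP_SAMPLES))
    print(model_size_kib(PARAMS, BYTES_PER_PARAM))
    print(gated_macs_per_second(MACS_PER_SECOND, active_fraction=0.5))
```

Under these assumed settings the model would process 62.5 frames per second with a budget of 896,000 MACs per frame, and the weights would occupy about 145 KiB. Whether the reported 56M MAC/s already accounts for the noise detector is not stated in the abstract, so the gated figure is only an illustration of how adaptive processing could lower average cost.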
Related papers
- THLNet: Two-stage Heterogeneous Lightweight Network for Monaural Speech Enhancement (2023) · 0.00
- CheapNET: Improving Light-weight Speech Enhancement Network by Projected Loss Function (2023) · 0.00
- LMFCA-Net: A Lightweight Model for Multi-channel Speech Enhancement with Efficient Narrow-band and Cross-band Attention (2025) · 3.58
- Real-time Speech Frequency Bandwidth Extension (2020) · 12.54
- A Lightweight Dual-stage Framework for Personalized Speech Enhancement Based on DeepFilterNet2 (2024) · 2.26
- MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra (2023) · 15.51
- Fast FullSubNet: Accelerate Full-band and Sub-band Fusion Model for Single-channel Speech Enhancement (2022) · 5.56
- Human Listening and Live Captioning: Multi-task Training for Speech Enhancement (2021) · 9.92