CleanUMamba: A Compact Mamba Network For Speech Denoising Using Channel Pruning
2024 · Sjoerd Groot, Qinyu Chen, Jan C. van Gemert, et al.
Abstract
This paper presents CleanUMamba, a time-domain neural network architecture designed for real-time causal audio denoising directly applied to raw waveforms. CleanUMamba leverages a U-Net encoder-decoder structure, incorporating the Mamba state-space model in the bottleneck layer. By replacing conventional self-attention and LSTM mechanisms with Mamba, our architecture offers superior denoising performance while maintaining a constant memory footprint, enabling streaming operation. To enhance efficiency, we applied structured channel pruning, achieving an 8X reduction in model size without compromising audio quality. Our model demonstrates strong results in the Interspeech 2020 Deep Noise Suppression challenge. Specifically, CleanUMamba achieves a PESQ score of 2.42 and STOI of 95.1% with only 442K parameters and 468M MACs, matching or outperforming larger models in real-time performance. Code will be available at: https://github.com/lab-emi/CleanUMamba
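The structured channel pruning mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which importance criterion the authors use, so the L1-norm ranking below is an assumption chosen for simplicity, and the helper names (`select_channels`, `prune_output_channels`, and so on) are hypothetical. The example removes whole output channels from one 1-D convolution and the matching input channels from the next layer, which is what makes the pruning "structured": the remaining weights stay dense and the parameter count actually shrinks.

```python
from typing import List

# A 1-D conv kernel stored as nested lists: [out_channels][in_channels][kernel_size].
Kernel = List[List[List[float]]]


def channel_l1_norms(weight: Kernel) -> List[float]:
    """L1 norm of each output channel (assumed importance score, not the paper's)."""
    return [sum(abs(w) for in_ch in out_ch for w in in_ch) for out_ch in weight]


def select_channels(weight: Kernel, keep: int) -> List[int]:
    """Indices of the `keep` output channels with the largest L1 norm, in original order."""
    norms = channel_l1_norms(weight)
    ranked = sorted(range(len(weight)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:keep])


def prune_output_channels(weight: Kernel, kept: List[int]) -> Kernel:
    """Drop every output channel that is not in `kept`."""
    return [weight[i] for i in kept]


def prune_input_channels(weight: Kernel, kept: List[int]) -> Kernel:
    """Drop the matching input channels of the layer that consumes the pruned output."""
    return [[out_ch[i] for i in kept] for out_ch in weight]


def count_params(weight: Kernel) -> int:
    """Number of scalar weights in the kernel (biases ignored)."""
    return sum(len(in_ch) for out_ch in weight for in_ch in out_ch)


# Toy two-layer stack: conv1 maps 1 -> 4 channels, conv2 maps 4 -> 2 channels.
conv1: Kernel = [
    [[0.9, -0.8, 0.7]],
    [[0.01, 0.0, -0.02]],
    [[-0.5, 0.6, 0.4]],
    [[0.03, -0.01, 0.02]],
]
conv2: Kernel = [[[0.1, 0.1, 0.1] for _ in range(4)] for _ in range(2)]

params_before = count_params(conv1) + count_params(conv2)

kept = select_channels(conv1, keep=2)
conv1_pruned = prune_output_channels(conv1, kept)
conv2_pruned = prune_input_channels(conv2, kept)

params_after = count_params(conv1_pruned) + count_params(conv2_pruned)
print(kept, params_before, params_after)
```

In this toy case, halving the channels of the first layer halves the weights of both layers it touches. A real pruning pipeline would typically also handle biases, normalization parameters, and skip connections in the U-Net, and would fine-tune after pruning; none of that is shown here.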
Related papers
- U-Mamba-Net: A Highly Efficient Mamba-based U-net Style Network For Noisy And Reverberant Speech Separation (2024)
- Mamba-SEUNet: Mamba UNet For Monaural Speech Enhancement (2024)
- Leveraging Joint Spectral And Spatial Learning With MAMBA For Multichannel Speech Enhancement (2024)
- A Comparative Evaluation Of Deep Learning Models For Speech Enhancement In Real-world Noisy Environments (2025)
- Schrödinger Bridge Mamba For One-step Speech Enhancement (2025)
- MP-SENet: A Speech Enhancement Model With Parallel Denoising Of Magnitude And Phase Spectra (2023)
- A Wavenet For Speech Denoising (2017)
- Mamba2 Meets Silence: Robust Vocal Source Separation For Sparse Regions (2025)