An Investigation Of Incorporating Mamba For Speech Enhancement
2024 · Rong Chao, Wen-Huang Cheng, Moreno La Quatra, et al.
Abstract
This work investigates the use of a recently proposed, attention-free, scalable state-space model (SSM), Mamba, for the speech enhancement (SE) task. In particular, we employ Mamba to deploy different regression-based SE models (SEMamba) with different configurations, namely basic, advanced, causal, and non-causal. Furthermore, both loss functions based on signal-level distances and metric-oriented loss functions are considered. Experimental evidence shows that SEMamba attains a competitive PESQ of 3.55 on the VoiceBank-DEMAND dataset with the advanced, non-causal configuration. A new state-of-the-art PESQ of 3.69 is also reported when SEMamba is combined with Perceptual Contrast Stretching (PCS). Compared against equivalent Transformer-based SE solutions, a noticeable FLOPs reduction of up to ~12% is observed with the advanced non-causal configurations. Finally, SEMamba can be used as a pre-processing step before automatic speech recognition (ASR), showing competitive performance against recent
Related papers
- Mamba-SEUNet: Mamba UNet For Monaural Speech Enhancement (2024)
- Schrödinger Bridge Mamba For One-step Speech Enhancement (2025)
- An Exploration Of Mamba For Speech Self-supervised Models (2025)
- Leveraging Joint Spectral And Spatial Learning With MAMBA For Multichannel Speech Enhancement (2024)
- Improving Speech Enhancement By Cross- And Sub-band Processing With State Space Model (2025)
- Speech Slytherin: Examining The Performance And Efficiency Of Mamba For Speech Separation, Recognition, And Synthesis (2024)
- Mamba-based Decoder-only Approach With Bidirectional Speech Modeling For Speech Recognition (2024)
- Samba-ASR: State-of-the-art Speech Recognition Leveraging Structured State-space Models (2025)