Leveraging Joint Spectral And Spatial Learning With MAMBA For Multichannel Speech Enhancement
2024 · Wenze Ren, Haibin Wu, Yi-Cheng Lin, et al.
Abstract
In multichannel speech enhancement, effectively capturing spatial and spectral information across different microphones is crucial for noise reduction. Traditional methods, such as CNNs or LSTMs, attempt to model the temporal dynamics of full-band and sub-band spectral and spatial features. However, these approaches are limited in how fully they can model complex temporal dependencies, especially in dynamic acoustic environments. To overcome these challenges, we modify the advanced McNet model by introducing an improved version of Mamba, a state-space model, and propose MCMamba. MCMamba has been completely reengineered to integrate full-band and narrow-band spatial information with sub-band and full-band spectral features, providing a more comprehensive approach to modeling spatial and spectral information. Our experimental results demonstrate that MCMamba significantly improves the modeling of spatial and spectral features in multichannel speech enhancement, outperforming McN
Related papers
- Mamba-SEUNet: Mamba UNet For Monaural Speech Enhancement (2024) 7.16
- An Investigation Of Incorporating Mamba For Speech Enhancement (2024) 13.70
- Multichannel Long-term Streaming Neural Speech Enhancement For Static And Moving Speakers (2024) 16.05
- Improving Speech Enhancement By Cross- And Sub-band Processing With State Space Model (2025) 3.58
- Schrödinger Bridge Mamba For One-step Speech Enhancement (2025) 0.00
- U-Mamba-Net: A Highly Efficient Mamba-based U-Net Style Network For Noisy And Reverberant Speech Separation (2024) 4.52
- SSAMBA: Self-supervised Audio Representation Learning With Mamba State Space Model (2024) 0.00
- Improving Dual-microphone Speech Enhancement By Learning Cross-channel Features With Multi-head Attention (2022) 6.77