High Fidelity Speech Enhancement With Band-split RNN
2022 · Jianwei Yu, Yi Luo, Hangting Chen, et al.
Abstract
Despite the rapid progress in speech enhancement (SE) research, enhancing the quality of desired speech in environments with strong noise and interfering speakers remains challenging. In this paper, we extend the recently proposed band-split RNN (BSRNN) model to full-band SE and personalized SE (PSE) tasks. To mitigate the effects of unstable high-frequency components in full-band speech, we apply bi-directional band-level modeling to the low-frequency subbands and uni-directional band-level modeling to the high-frequency subbands. For the PSE task, we incorporate a speaker enrollment module into BSRNN to utilize target speaker information. Moreover, we use a MetricGAN discriminator (MGD) and a multi-resolution spectrogram discriminator (MRSD) to improve perceptual quality metrics. Experimental results show that our system outperforms various top-ranking SE systems, achieves state-of-the-art (SOTA) results on the DNS-2020 test set, and ranks among the top 3 in the DNS-2023 challenge.
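The band-split idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the band widths, the cutoff bin, and the helper names `band_edges`, `split_bands`, and `band_directions` are all hypothetical, chosen only to show how frequency bins might be grouped into subbands and how each subband might be tagged for bi-directional or uni-directional band-level modeling.

```python
from typing import List, Sequence, Tuple


def band_edges(num_bins: int, widths: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (start, stop) bin indices for consecutive, non-overlapping subbands.

    Bins left over after the listed widths are grouped into one final band.
    The widths are illustrative assumptions, not values from the paper.
    """
    edges: List[Tuple[int, int]] = []
    start = 0
    for width in widths:
        if width <= 0:
            raise ValueError("band widths must be positive")
        stop = start + width
        if stop > num_bins:
            raise ValueError("band widths exceed the number of frequency bins")
        edges.append((start, stop))
        start = stop
    if start < num_bins:
        edges.append((start, num_bins))
    return edges


def split_bands(frame: Sequence[float], edges: Sequence[Tuple[int, int]]) -> List[List[float]]:
    """Slice one spectrogram frame (a sequence of bin values) into subbands."""
    return [list(frame[start:stop]) for start, stop in edges]


def band_directions(edges: Sequence[Tuple[int, int]], cutoff_bin: int) -> List[str]:
    """Tag each subband for band-level modeling.

    A band lying entirely below the (hypothetical) cutoff bin is tagged "bi";
    every other band is tagged "uni".
    """
    return ["bi" if stop <= cutoff_bin else "uni" for _, stop in edges]


# Toy example: 16 frequency bins, three narrow bands plus one wide remainder.
edges = band_edges(16, [2, 2, 4])
frame = list(range(16))
subbands = split_bands(frame, edges)
directions = band_directions(edges, cutoff_bin=8)
```

In a real system, each subband would be projected to a fixed-size feature vector before the sequence-level and band-level recurrent layers; the sketch stops at the split and the direction tag.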
Related papers
- Personalized Speech Enhancement Combining Band-split RNN And Speaker Attentive Module (2023) 0.00
- Bridging The Gap: Integrating Pre-trained Speech Enhancement And Recognition Models For Robust Speech Recognition (2024) 7.50
- Magnitude-phase Dual-path Speech Enhancement Network Based On Self-supervised Embedding And Perceptual Contrast Stretch Boosting (2025) 3.21
- LiSenNet: Lightweight Sub-band And Dual-path Modeling For Real-time Speech Enhancement (2024) 9.03
- Reinforcement Learning Based Speech Enhancement For Robust Speech Recognition (2018) 11.08
- A Two-stage Full-band Speech Enhancement Model With Effective Spectral Compression Mapping (2022) 0.00
- FNSE-SBGAN: Far-field Speech Enhancement With Schrödinger Bridge And Generative Adversarial Networks (2025) 3.58
- SNR-progressive Model With Harmonic Compensation For Low-SNR Speech Enhancement (2024) 4.52