Diffusion-based Speech Enhancement With Schrödinger Bridge And Symmetric Noise Schedule
2024 · Siyi Wang, Siyi Liu, Andrew Harper, et al.
Abstract
Recently, diffusion-based generative models have demonstrated remarkable performance in speech enhancement tasks. However, these methods still encounter challenges, including the lack of structural information and poor performance in low Signal-to-Noise Ratio (SNR) scenarios. To overcome these challenges, we propose the Schrödinger Bridge-based Speech Enhancement (SBSE) method, which learns the diffusion process directly between the noisy input and the clean distribution, unlike conventional diffusion-based speech enhancement systems, which learn a mapping from the data distribution to a Gaussian distribution. To enhance performance in extremely noisy conditions, we introduce a two-stage system that incorporates ratio mask information into the diffusion-based generative model. Our experimental results show that the proposed SBSE method outperforms all the baseline models and achieves state-of-the-art performance, especially in low SNR conditions. Importantly, only a few inference steps are required to achieve the best
Related papers
- Robust Speech Recognition With Schrödinger Bridge-based Speech Enhancement (2025) 2.26
- FNSE-SBGAN: Far-field Speech Enhancement With Schrödinger Bridge And Generative Adversarial Networks (2025) 3.58
- Speech Enhancement And Dereverberation With Diffusion-based Generative Models (2022) 23.51
- Schrödinger Bridges Beat Diffusion Models On Text-to-speech Synthesis (2023) 0.00
- Noise-aware Speech Enhancement Using Diffusion Probabilistic Model (2023) 8.82
- Investigating Training Objectives For Generative Speech Enhancement (2024) 9.76
- Diffusion-based Speech Enhancement With A Weighted Generative-supervised Learning Loss (2023) 0.00
- StoRM: A Diffusion-based Stochastic Regeneration Model For Speech Enhancement And Dereverberation (2022) 15.43