Analysing Diffusion-based Generative Approaches Versus Discriminative Approaches For Speech Restoration
2022 · Jean-Marie Lemercier, Julius Richter, Simon Welker, et al.
Abstract
Diffusion-based generative models have had a strong impact on the computer vision and speech processing communities in recent years. Besides data generation tasks, they have also been employed for data restoration tasks such as speech enhancement and dereverberation. While discriminative models have traditionally been argued to be more powerful, e.g. for speech enhancement, generative diffusion approaches have recently been shown to narrow this performance gap considerably. In this paper, we systematically compare the performance of generative diffusion models and discriminative approaches on different speech restoration tasks. For this, we extend our prior contributions on diffusion-based speech enhancement in the complex time-frequency domain to the task of bandwidth extension. We then compare it to a discriminatively trained neural network with the same network architecture on three restoration tasks, namely speech denoising, dereverberation and bandwidth extension. We observe that the ge
Related papers
- Speech Enhancement And Dereverberation With Diffusion-based Generative Models (2022) 23.51
- Storm: A Diffusion-based Stochastic Regeneration Model For Speech Enhancement And Dereverberation (2022) 15.43
- Investigating The Design Space Of Diffusion Models For Speech Enhancement (2023) 10.07
- Extract And Diffuse: Latent Integration For Improved Diffusion-based Speech And Vocal Enhancement (2024) 0.00
- Single And Few-step Diffusion For Generative Speech Enhancement (2023) 10.21
- Diffusion-based Generative Modeling With Discriminative Guidance For Streamable Speech Enhancement (2024) 7.16
- Adversarial Training Of Denoising Diffusion Model Using Dual Discriminators For High-fidelity Multi-speaker TTS (2023) 2.26
- The Effect Of Training Dataset Size On Discriminative And Diffusion-based Speech Enhancement Systems (2024) 6.77