Unsupervised Speech Enhancement With Deep Dynamical Generative Speech And Noise Models
2023 · Xiaoyu Lin, Simon Leglaive, Laurent Girin, et al.
Abstract
This work builds on previous work on unsupervised speech enhancement that uses a dynamical variational autoencoder (DVAE) as the clean speech model and non-negative matrix factorization (NMF) as the noise model. We propose to replace the NMF noise model with a deep dynamical generative model (DDGM) that depends on the DVAE latent variables, on the noisy observations, or on both. This DDGM can be trained in three configurations: noise-agnostic, noise-dependent, and noise adaptation after noise-dependent training. Experimental results show that the proposed method performs competitively with state-of-the-art unsupervised speech enhancement methods, while the noise-dependent training configuration yields a much more time-efficient inference process.
Related papers
- Unsupervised Speech Enhancement Using Dynamical Variational Auto-encoders (2021) 13.28
- A Statistically Principled And Computationally Efficient Approach To Speech Enhancement Using Variational Autoencoders (2019) 9.23
- A Recurrent Variational Autoencoder For Speech Enhancement (2019) 13.97
- Statistical Speech Enhancement Based On Probabilistic Integration Of Variational Autoencoder And Non-negative Matrix Factorization (2017) 15.00
- Dynamic Attention Based Generative Adversarial Network With Phase Post-processing For Speech Enhancement (2020) 0.00
- Audio-visual Speech Enhancement With A Deep Kalman Filter Generative Model (2022) 6.34
- Diffusion-based Unsupervised Audio-visual Speech Enhancement (2024) 4.52
- SEGAN: Speech Enhancement Generative Adversarial Network (2017) 21.85