A Recurrent Variational Autoencoder For Speech Enhancement
2019 · Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, et al.
Abstract
This paper presents a generative approach to speech enhancement based on a recurrent variational autoencoder (RVAE). The deep generative speech model is trained using clean speech signals only, and it is combined with a nonnegative matrix factorization noise model for speech enhancement. We propose a variational expectation-maximization algorithm where the encoder of the RVAE is fine-tuned at test time, to approximate the distribution of the latent variables given the noisy speech observations. Compared with previous approaches based on feed-forward fully-connected architectures, the proposed recurrent deep generative speech model induces a posterior temporal dynamic over the latent variables, which is shown to improve the speech enhancement results.
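The abstract does not say how the speech and noise models are combined at the signal level. One common way to combine a speech variance model with a nonnegative matrix factorization (NMF) noise variance is a Wiener-like gain, sketched below. This is a minimal illustration under stated assumptions, not the paper's algorithm: the function names, the toy numbers, and the hard-coded `speech_var` (standing in for the output of an RVAE decoder) are all hypothetical, and the variational expectation-maximization steps and encoder fine-tuning are not shown.

```python
# Illustrative sketch only: names and toy values are assumptions,
# not taken from the paper.

def nmf_variance(W, H):
    """Noise variance as the nonnegative product W @ H (frequency x time)."""
    rank = len(H)
    n_frames = len(H[0])
    return [
        [sum(W[f][k] * H[k][n] for k in range(rank)) for n in range(n_frames)]
        for f in range(len(W))
    ]

def wiener_gain(speech_var, noise_var):
    """Element-wise gain speech_var / (speech_var + noise_var), in [0, 1]."""
    return [
        [s / (s + b) for s, b in zip(s_row, b_row)]
        for s_row, b_row in zip(speech_var, noise_var)
    ]

# Toy example: 2 frequency bins, 3 time frames, NMF rank 1.
W = [[1.0], [2.0]]          # spectral template (frequency x rank)
H = [[0.5, 1.0, 2.0]]       # activations (rank x time)
noise_var = nmf_variance(W, H)

# Hypothetical stand-in for the speech variance an RVAE decoder would output.
speech_var = [[0.5, 3.0, 2.0], [1.0, 2.0, 0.0]]

# Gain to apply to each time-frequency bin of the noisy spectrogram.
gain = wiener_gain(speech_var, noise_var)
```

Multiplying the noisy short-time Fourier coefficients by `gain` attenuates bins where the noise variance dominates. In a full system, the speech variance would come from the trained generative model rather than a fixed table.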
Related papers
- Posterior Sampling Algorithms For Unsupervised Speech Enhancement With Recurrent Variational Autoencoder (2023)
- A Statistically Principled And Computationally Efficient Approach To Speech Enhancement Using Variational Autoencoders (2019)
- RVAE-EM: Generative Speech Dereverberation Based On Recurrent Variational Auto-encoder And Convolutive Transfer Function (2023)
- Statistical Speech Enhancement Based On Probabilistic Integration Of Variational Autoencoder And Non-negative Matrix Factorization (2017)
- Unsupervised Speech Enhancement Using Dynamical Variational Auto-encoders (2021)
- Audio-visual Speech Enhancement Using Conditional Variational Auto-encoders (2019)
- Complex Recurrent Variational Autoencoder With Application To Speech Enhancement (2022)
- Unsupervised Speech Enhancement With Deep Dynamical Generative Speech And Noise Models (2023)