A Bayesian Permutation Training Deep Representation Learning Method For Speech Enhancement With Variational Autoencoder
2022 · Yang Xiang, Jesper Lisby Højvang, Morten Højfeldt Rasmussen, et al.
Abstract
Recently, the variational autoencoder (VAE), a deep representation learning (DRL) model, has been used to perform speech enhancement (SE). However, to the best of our knowledge, current VAE-based SE methods apply the VAE only to model the speech signal, while noise is modeled using the traditional non-negative matrix factorization (NMF) model. One of the most important reasons for using NMF is that these VAE-based methods cannot disentangle the speech and noise latent variables from the observed signal. Based on Bayesian theory, this paper derives a novel variational lower bound for the VAE, which ensures that the VAE can be trained in a supervised manner and can disentangle speech and noise latent variables from the observed signal. This means that the proposed method can apply the VAE to model both speech and noise signals, which differs fundamentally from previous VAE-based SE work. More specifically, the proposed DRL method can learn to impose speech and noise signal priors to different sets of laten
Related papers
- Investigation Of Speech And Noise Latent Representations In Single-channel VAE-based Speech Enhancement (2025) · 0.00
- A Deep Representation Learning-based Speech Enhancement Method Using Complex Convolution Recurrent Variational Autoencoder (2023) · 7.16
- A Statistically Principled And Computationally Efficient Approach To Speech Enhancement Using Variational Autoencoders (2019) · 9.23
- Statistical Speech Enhancement Based On Probabilistic Integration Of Variational Autoencoder And Non-negative Matrix Factorization (2017) · 15.00
- Can We Trust Deep Speech Prior? (2020) · 2.26
- I-DCCRN-VAE: An Improved Deep Representation Learning Framework For Complex VAE-based Single-channel Speech Enhancement (2025) · 0.00
- A Recurrent Variational Autoencoder For Speech Enhancement (2019) · 13.97
- Unsupervised Speech Enhancement Using Dynamical Variational Auto-encoders (2021) · 13.28