A Deep Representation Learning-based Speech Enhancement Method Using Complex Convolution Recurrent Variational Autoencoder
2023 · Yang Xiang, Jingguang Tian, Xinhui Hu, et al.
Abstract
Generally, the performance of deep neural networks (DNNs) heavily depends on the quality of data representation learning. Our preliminary work has emphasized the significance of deep representation learning (DRL) in the context of speech enhancement (SE) applications. Specifically, our initial SE algorithm employed a gated recurrent unit variational autoencoder (VAE) with a Gaussian distribution to enhance the performance of certain existing SE systems. Building upon our preliminary framework, this paper introduces a novel approach for SE using deep complex convolutional recurrent networks with a VAE (DCCRN-VAE). DCCRN-VAE assumes that the latent variables of signals follow complex Gaussian distributions that are modeled by DCCRN, as these distributions can better capture the behaviors of complex signals. Additionally, we propose the application of a residual loss in DCCRN-VAE to further improve the quality of the enhanced speech. Compared to our preliminary work, DCCRN-VAE introduce
Related papers
- I-DCCRN-VAE: An Improved Deep Representation Learning Framework For Complex VAE-based Single-channel Speech Enhancement (2025) · 0.00
- Complex Recurrent Variational Autoencoder With Application To Speech Enhancement (2022) · 0.00
- DCCRGAN: Deep Complex Convolution Recurrent Generator Adversarial Network For Speech Enhancement (2020) · 0.00
- DCCRN: Deep Complex Convolution Recurrent Network For Phase-aware Speech Enhancement (2020) · 20.78
- A Bayesian Permutation Training Deep Representation Learning Method For Speech Enhancement With Variational Autoencoder (2022) · 7.16
- DCCRN+: Channel-wise Subband DCCRN With SNR Estimation For Speech Enhancement (2021) · 0.00
- Rethinking Complex-valued Deep Neural Networks For Monaural Speech Enhancement (2023) · 6.77
- S-DCCRN: Super Wide Band DCCRN With Learnable Complex Feature For Speech Enhancement (2021) · 11.93