DCCRGAN: Deep Complex Convolution Recurrent Generator Adversarial Network For Speech Enhancement
2020 · Huixiang Huang, Renjie Wu, Jingbiao Huang, et al.
Abstract
Generative adversarial networks (GANs) still have some problems in handling the speech enhancement (SE) task. Some GAN-based systems adopt the Pixel-to-Pixel structure directly, without task-specific optimization. The importance of the generator network has not been fully explored. Other related work changes the generator network but operates in the time-frequency domain, which ignores the phase mismatch problem. To solve these problems, this paper proposes a deep complex convolution recurrent GAN (DCCRGAN) structure. The complex module models the correlation between the magnitude and phase of the waveform and has been shown to be effective. The proposed structure is trained end to end. Different LSTM layers are used in the generator network to thoroughly explore the speech enhancement performance of DCCRGAN. The experimental results confirm that the proposed DCCRGAN outperforms state-of-the-art GAN-based SE systems.
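As a rough illustration of how a complex-valued module can couple magnitude and phase, the sketch below builds a 1-D complex convolution out of four real-valued products using the ordinary complex multiplication rule. The abstract does not spell out the layer's formulation, so this is an assumption for illustration only; the function name `complex_conv1d` and the toy inputs are hypothetical and are not taken from the paper.

```python
import cmath


def complex_conv1d(x_re, x_im, w_re, w_im):
    """Valid-mode 1-D complex convolution (correlation form, no kernel flip).

    Hypothetical helper: inputs and kernel are given as separate real and
    imaginary parts, and the output mixes them via
    (a + jb)(c + jd) = (ac - bd) + j(ad + bc).
    """
    n, k = len(x_re), len(w_re)
    out_re, out_im = [], []
    for t in range(n - k + 1):
        # Real part: real*real minus imag*imag.
        re = sum(w_re[j] * x_re[t + j] - w_im[j] * x_im[t + j] for j in range(k))
        # Imaginary part: real*imag plus imag*real.
        im = sum(w_re[j] * x_im[t + j] + w_im[j] * x_re[t + j] for j in range(k))
        out_re.append(re)
        out_im.append(im)
    return out_re, out_im


# Toy complex "spectrum" frame and kernel (made-up values).
x = [1 + 0j, 0 + 1j, -1 + 0j]
w = [0.5 + 0.5j, 1 - 1j]

out_re, out_im = complex_conv1d(
    [v.real for v in x], [v.imag for v in x],
    [v.real for v in w], [v.imag for v in w],
)

# Magnitude and phase of each output sample; both depend on the real and
# imaginary parts of the input together.
polar = [cmath.polar(complex(r, i)) for r, i in zip(out_re, out_im)]
```

Because every output sample combines the real and imaginary parts of both the input and the kernel, a change in the input phase alters the output magnitude as well, which is the kind of coupling a magnitude-only model cannot express.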
Related papers
- SEGAN: Speech Enhancement Generative Adversarial Network (2017)
- DCCRN: Deep Complex Convolution Recurrent Network For Phase-aware Speech Enhancement (2020)
- TDCGAN: Temporal Dilated Convolutional Generative Adversarial Network For End-to-end Speech Enhancement (2020)
- Dynamic Attention Based Generative Adversarial Network With Phase Post-processing For Speech Enhancement (2020)
- A Deep Representation Learning-based Speech Enhancement Method Using Complex Convolution Recurrent Variational Autoencoder (2023)
- Conditional Generative Adversarial Networks For Speech Enhancement And Noise-robust Speaker Verification (2017)
- Investigating Generative Adversarial Networks Based Speech Dereverberation For Robust Speech Recognition (2018)
- CMGAN: Conformer-based Metric GAN For Speech Enhancement (2022)