Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks
2020 Β· Jingdong Li, Hui Zhang, Xueliang Zhang, et al.
Abstract
In recent decades, neural network based methods have significantly improved the performace of speech enhancement. Most of them estimate time-frequency (T-F) representation of target speech directly or indirectly, then resynthesize waveform using the estimated T-F representation. In this work, we proposed the temporal convolutional recurrent network (TCRN), an end-to-end model that directly map noisy waveform to clean waveform. The TCRN, which is combined convolution and recurrent neural network, is able to efficiently and effectively leverage short-term ang long-term information. Futuremore, we present the architecture that repeatedly downsample and upsample speech during forward propagation. We show that our model is able to improve the performance of model, compared with existing convolutional recurrent networks. Futuremore, We present several key techniques to stabilize the training process. The experimental results show that our model consistently outperforms existing speech enhanc
Authors
(none)
Tags
Stats
Related papers
- TFCN: Temporal-frequential Convolutional Network For Single-channel Speech Enhancement (2022)0.00
- Monaural Speech Enhancement Using A Multi-branch Temporal Convolutional Network (2019)3.58
- Wavecrn: An Efficient Convolutional Recurrent Neural Network For End-to-end Speech Enhancement (2020)14.02
- Complex Spectral Mapping With Attention Based Convolution Recurrent Neural Network For Speech Enhancement (2021)0.00
- Residual Convolutional CTC Networks For Automatic Speech Recognition (2017)0.00
- DCCRN: Deep Complex Convolution Recurrent Network For Phase-aware Speech Enhancement (2020)20.78
- DCCRN+: Channel-wise Subband DCCRN With SNR Estimation For Speech Enhancement (2021)0.00
- Raw Waveform-based Speech Enhancement By Fully Convolutional Networks (2017)16.63