Inplace Gated Convolutional Recurrent Neural Network For Dual-channel Speech Enhancement
2021 · Jinjiang Liu, Xueliang Zhang
Abstract
For dual-channel speech enhancement, it is a promising idea to design an end-to-end model based on the traditional array signal processing guideline and the manifold space of multi-channel signals. We find that this idea can be implemented effectively with the classical convolutional recurrent neural network (CRN) architecture. We propose a very compact inplace gated convolutional recurrent neural network (inplace GCRN) for end-to-end multi-channel speech enhancement, which uses inplace convolution for frequency pattern extraction and reconstruction. The inplace property efficiently preserves the spatial cues in each frequency bin, so that channel-wise long short-term memory (LSTM) networks can trace the spatial source. In addition, we propose a new spectrum recovery method that predicts an amplitude mask, a mapping, and the phase, which effectively improves speech quality.
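The abstract does not spell out the layer details, so the following is only a minimal sketch of the general idea. It assumes that "inplace convolution" means a convolution along the frequency axis with stride 1 and no downsampling, so that every output stays aligned with one input frequency bin. The function name `conv_along_frequency`, the toy spectrogram, and the smoothing kernel are all hypothetical and are not taken from the paper.

```python
import numpy as np


def conv_along_frequency(x, kernel, stride=1):
    """Convolve each time frame of x (frames x freq_bins) along frequency.

    Zero-pads so that stride=1 yields exactly one output per input bin.
    This is an illustrative stand-in, not the authors' layer.
    """
    k = len(kernel)
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    padded = np.pad(x, ((0, 0), (pad_left, pad_right)))
    # windows has shape (frames, freq_bins, k)
    windows = np.lib.stride_tricks.sliding_window_view(padded, k, axis=1)
    out = windows @ kernel  # cross-correlation, as in deep learning "conv"
    return out[:, ::stride]


# Toy magnitude spectrogram: 4 frames, 8 frequency bins, energy only in bin 5.
x = np.zeros((4, 8))
x[:, 5] = 1.0
kernel = np.array([0.25, 0.5, 0.25])  # hypothetical smoothing kernel

inplace = conv_along_frequency(x, kernel, stride=1)  # assumed "inplace" case
strided = conv_along_frequency(x, kernel, stride=2)  # downsampling, for contrast

print(inplace.shape, int(np.argmax(inplace[0])))
print(strided.shape)
```

With stride 1 the output keeps all 8 bins and the peak stays at bin 5, so a per-bin recurrent layer would still see features tied to that bin. With stride 2 only 4 outputs remain and the one-to-one correspondence with input bins is lost.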
Related papers
- WaveCRN: An Efficient Convolutional Recurrent Neural Network For End-to-end Speech Enhancement (2020)
- Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks (2020)
- DPCRN: Dual-path Convolution Recurrent Network For Single Channel Speech Enhancement (2021)
- SICRN: Advancing Speech Enhancement Through State Space Model And Inplace Convolution Techniques (2024)
- DCCRN: Deep Complex Convolution Recurrent Network For Phase-aware Speech Enhancement (2020)
- Multi-channel End-to-end Neural Network For Speech Enhancement, Source Localization, And Voice Activity Detection (2022)
- FurcaNet: An End-to-end Deep Gated Convolutional, Long Short-term Memory, Deep Neural Networks For Single Channel Speech Separation (2019)
- PDPCRN: Parallel Dual-path CRN With Bi-directional Inter-branch Interactions For Multi-channel Speech Enhancement (2023)