A Fully Recurrent Feature Extraction For Single Channel Speech Enhancement
2020 · Muhammed Pv Shifas, Santelli Claudio, Vassilis Tsiaras, et al.
Abstract
Convolutional neural network (CNN) modules are widely used to build high-end speech enhancement models. However, the feature extraction power of vanilla CNN modules is limited by the dimensionality constraint of their convolution kernels, so they are limited in how well they can model noise context information at the feature extraction stage. To address this, we add recurrency to the feature-extracting CNN layers and introduce a robust, context-aware feature extraction strategy for single-channel speech enhancement. As shown, adding recurrency captures the local statistics of noise attributes at the level of the extracted features, and thus the suggested model is effective in differentiating speech cues even in very noisy conditions. When evaluated against enhancement models using vanilla CNN modules in unseen noise conditions, the suggested model with recurrency in the feature extraction layers has produced a segmental SNR (SSNR)
Related papers
- FRCRN: Boosting Feature Representation Using Frequency Recurrence For Monaural Speech Enhancement (2022) · 22.16
- Wavecrn: An Efficient Convolutional Recurrent Neural Network For End-to-end Speech Enhancement (2020) · 14.02
- Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks (2020) · 5.84
- DCCRN: Deep Complex Convolution Recurrent Network For Phase-aware Speech Enhancement (2020) · 20.78
- Constrained Convolutional-recurrent Networks To Improve Speech Quality With Low Impact On Recognition Accuracy (2018) · 5.24
- A Dual-staged Context Aggregation Method Towards Efficient End-to-end Speech Enhancement (2019) · 0.00
- TFCN: Temporal-frequential Convolutional Network For Single-channel Speech Enhancement (2022) · 0.00
- Using Recurrences In Time And Frequency Within U-net Architecture For Speech Enhancement (2018) · 8.35