RHR-Net: A Residual Hourglass Recurrent Neural Network For Speech Enhancement
2019 · Jalal Abdulbaqi, Yue Gu, Ivan Marsic
Abstract
Most current speech enhancement models use spectrogram features that require an expensive transformation and result in phase information loss. Previous work has overcome these issues by using convolutional networks to learn long-range temporal correlations across high-resolution waveforms. These models, however, are limited by memory-intensive dilated convolution and aliasing artifacts from upsampling. We introduce an end-to-end, fully recurrent, hourglass-shaped neural network architecture with residual connections for waveform-based single-channel speech enhancement. Our model can efficiently capture long-range temporal dependencies by reducing the feature resolution without information loss. Experimental results show that our model outperforms state-of-the-art approaches in six evaluation metrics.
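The abstract does not say how the feature resolution is reduced without information loss. One way this could work, shown here as a minimal sketch and an assumption rather than the authors' stated method, is to fold adjacent time steps into the feature axis, so the sequence gets shorter while every sample is kept. The helper names `fold_time` and `unfold_time` are hypothetical.

```python
import numpy as np


def fold_time(x, factor=2):
    """Shorten the time axis by `factor` by packing neighbouring
    time steps into the feature axis (assumed mechanism, no samples dropped)."""
    steps, feats = x.shape
    if steps % factor != 0:
        raise ValueError("number of time steps must be divisible by factor")
    return x.reshape(steps // factor, feats * factor)


def unfold_time(y, factor=2):
    """Inverse of fold_time: restore the original time resolution."""
    steps, feats = y.shape
    if feats % factor != 0:
        raise ValueError("number of features must be divisible by factor")
    return y.reshape(steps * factor, feats // factor)


# Toy "waveform": 16 time steps, 1 feature per step.
wave = np.arange(16, dtype=np.float32).reshape(16, 1)

folded = fold_time(wave, factor=4)        # 4 time steps, 4 features each
restored = unfold_time(folded, factor=4)  # back to 16 time steps, 1 feature
```

Unlike strided subsampling or pooling, this reshaping is exactly invertible, so a recurrent layer placed after it would process a shorter sequence without any input samples having been discarded.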
Related papers
- WaveCRN: An Efficient Convolutional Recurrent Neural Network For End-to-end Speech Enhancement (2020) 14.02
- Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks (2020) 5.84
- Dynamic Gated Recurrent Neural Network For Compute-efficient Speech Enhancement (2024) 8.35
- Constrained Convolutional-recurrent Networks To Improve Speech Quality With Low Impact On Recognition Accuracy (2018) 5.24
- Raw Waveform Encoder With Multi-scale Globally Attentive Locally Recurrent Networks For End-to-end Speech Recognition (2021) 0.00
- High Order Recurrent Neural Networks For Acoustic Modelling (2018) 8.60
- Speech Enhancement With Wide Residual Networks In Reverberant Environments (2019) 0.00
- Waveform Modeling And Generation Using Hierarchical Recurrent Neural Networks For Speech Bandwidth Extension (2018) 12.99