FB-MSTCN: A Full-band Single-channel Speech Enhancement Method Based On Multi-scale Temporal Convolutional Network
2022 · Zehua Zhang, Lu Zhang, Xuyi Zhuang, et al.
Abstract
In recent years, deep learning-based approaches have significantly improved the performance of single-channel speech enhancement. However, because of limited training data and high computational complexity, real-time enhancement of full-band (48 kHz) speech signals remains very challenging. The spectral energy in the high-frequency part is low, which makes it harder to model and enhance the full-band spectrum directly with neural networks. To solve this problem, this paper proposes a two-stage real-time speech enhancement model with an extraction-interpolation mechanism for full-band signals. The 48 kHz full-band time-domain signal is divided into three sub-channels by extraction, and a two-stage 'masking + compensation' processing scheme is proposed to enhance the signal in the complex domain. After the two-stage enhancement, the enhanced full-band speech signal is restored by interval interpolation. In the subjective listening and word accuracy test, our pro
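The extraction-interpolation idea in the abstract can be illustrated with a small sketch. It assumes that "extracting" means taking every third sample, so that a 48 kHz signal yields three interleaved 16 kHz sub-signals, and that "interval interpolation" means re-interleaving them; the abstract does not spell out either detail. The function names `extract_subchannels` and `interleave_subchannels` are hypothetical, and the two-stage enhancement network is left out entirely.

```python
import numpy as np


def extract_subchannels(x, n_sub=3):
    """Split a 1-D signal into n_sub interleaved sub-signals.

    Assumption: sub-signal k holds samples k, k + n_sub, k + 2 * n_sub, ...
    """
    x = np.asarray(x)
    usable = len(x) - len(x) % n_sub  # drop trailing samples that do not fill a group
    return x[:usable].reshape(-1, n_sub).T  # shape: (n_sub, usable // n_sub)


def interleave_subchannels(subs):
    """Inverse of extract_subchannels: put the samples back in their original order."""
    subs = np.asarray(subs)
    return subs.T.reshape(-1)


fs = 48_000                              # full-band sampling rate in Hz
t = np.arange(fs) / fs                   # one second of audio
x = np.sin(2 * np.pi * 440.0 * t)        # toy test signal, not speech

subs = extract_subchannels(x)            # three sub-signals at 16 kHz each
# A real system would enhance each sub-signal here; this sketch passes them through.
restored = interleave_subchannels(subs)
```

With no processing between the two steps, the round trip is lossless, so any change in the restored signal would come from the enhancement applied to the sub-signals.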
Related papers
- DMF-Net: A Decoupling-style Multi-band Fusion Model For Full-band Speech Enhancement (2022)
- FullSubNet+: Channel Attention FullSubNet With Complex Spectrograms For Speech Enhancement (2022)
- Monaural Speech Enhancement Using A Multi-branch Temporal Convolutional Network (2019)
- TFCN: Temporal-frequential Convolutional Network For Single-channel Speech Enhancement (2022)
- LMFCA-Net: A Lightweight Model For Multi-channel Speech Enhancement With Efficient Narrow-band And Cross-band Attention (2025)
- Distortionless Multi-channel Target Speech Enhancement For Overlapped Speech Recognition (2020)
- Speech Enhancement Using Multi-stage Self-attentive Temporal Convolutional Networks (2021)
- Narrow-band Deep Filtering For Multichannel Speech Enhancement (2019)