A Two-stage Full-band Speech Enhancement Model With Effective Spectral Compression Mapping
2022 · Zhongshu Hou, Qinwen Hu, Kai Chen, et al.
Abstract
Directly extending deep neural network (DNN) based wide-band speech enhancement (SE) to full-band processing runs into limited frequency resolution in the low-frequency range, which is likely to degrade model performance. In this paper, we propose a learnable spectral compression mapping (SCM) that compresses the high-frequency components so that they can be processed more efficiently. The model can then pay more attention to the low- and middle-frequency range, where most of the speech power is concentrated. Instead of suppressing noise with a single network, we first estimate a spectral magnitude mask that brings the speech to a high signal-to-noise ratio (SNR) state, and then use a second model to further refine the real and imaginary masks of the pre-enhanced signal. We conduct comprehensive experiments to validate the efficacy of the proposed method.
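The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `compress_high_band` and `two_stage_enhance` are hypothetical, fixed group averaging stands in for the learnable compression mapping, and the masks are supplied by hand rather than predicted by networks. Spectra are plain nested lists indexed as `[frequency bin][time frame]`.

```python
from typing import List


def compress_high_band(mag: List[List[float]], keep_bins: int, group: int) -> List[List[float]]:
    """Keep the lowest `keep_bins` bins and average the rest in groups of `group`.

    Assumption: fixed averaging replaces the learnable mapping described in
    the abstract. Trailing bins that do not fill a whole group are dropped.
    """
    low = [row[:] for row in mag[:keep_bins]]
    high = mag[keep_bins:]
    compressed = []
    for start in range(0, len(high) - group + 1, group):
        block = high[start:start + group]
        # Average the grouped bins frame by frame.
        compressed.append([sum(col) / group for col in zip(*block)])
    return low + compressed


def two_stage_enhance(
    noisy: List[List[complex]],
    mag_mask: List[List[float]],
    complex_mask: List[List[complex]],
) -> List[List[complex]]:
    """Apply a real magnitude mask, then a complex mask, to a complex spectrum."""
    enhanced = []
    for x_row, m_row, c_row in zip(noisy, mag_mask, complex_mask):
        # Stage 1: scale the magnitude and keep the noisy phase.
        pre = [m * x for m, x in zip(m_row, x_row)]
        # Stage 2: complex multiplication adjusts real and imaginary parts.
        enhanced.append([c * p for c, p in zip(c_row, pre)])
    return enhanced


# Ten frequency bins, two frames: 4 low bins kept, 6 high bins become 2.
magnitudes = [[float(2 * i), float(2 * i + 1)] for i in range(10)]
compressed = compress_high_band(magnitudes, keep_bins=4, group=3)
print(len(compressed))

result = two_stage_enhance([[1 + 1j, 2 + 0j]], [[0.5, 1.0]], [[1j, 1 + 0j]])
print(result)
```

In this sketch the compressed representation has fewer high-frequency bins than the input, so a model operating on it spends proportionally more of its capacity on the low and middle bands. The second function shows why the stages are complementary: the magnitude mask cannot change phase, while the complex mask can.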
Related papers
- DMF-Net: A Decoupling-style Multi-band Fusion Model For Full-band Speech Enhancement (2022)
- Multichannel Speech Enhancement By Raw Waveform-mapping Using Fully Convolutional Networks (2019)
- FB-MSTCN: A Full-band Single-channel Speech Enhancement Method Based On Multi-scale Temporal Convolutional Network (2022)
- SNR-progressive Model With Harmonic Compensation For Low-SNR Speech Enhancement (2024)
- High Fidelity Speech Enhancement With Band-split RNN (2022)
- DSP-informed Bandwidth Extension Using Locally-conditioned Excitation And Linear Time-varying Filter Subnetworks (2024)
- S-DCCRN: Super Wide Band DCCRN With Learnable Complex Feature For Speech Enhancement (2021)
- Deep Interaction Between Masking And Mapping Targets For Single-channel Speech Enhancement (2021)