Dmf-net: A Decoupling-style Multi-band Fusion Model For Full-band Speech Enhancement
2022 · Guochen Yu, Yuansheng Guan, Weixin Meng, et al.
Abstract
Because modeling more frequency bands is difficult and computationally expensive, full-band speech enhancement based on deep neural networks remains challenging. Previous studies usually adopt compressed full-band speech features on the Bark and ERB scales with relatively low frequency resolution, leading to degraded performance, especially in the high-frequency region. In this paper, we propose a decoupling-style multi-band fusion model to perform full-band speech denoising and dereverberation. Instead of optimizing the full-band speech with a single network structure, we decompose the full-band target into multiple sub-band speech features and then employ a multi-stage chain optimization strategy to estimate the clean spectrum stage by stage. Specifically, the low- (0-8 kHz), middle- (8-16 kHz), and high-frequency (16-24 kHz) regions are mapped by three separate sub-networks and are then fused to obtain the full-band clean target STFT spectrum. Comprehensive experiments on two public datase
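The band decomposition named in the abstract can be illustrated with a minimal sketch. It only shows how STFT frequency bins might be split at 8 kHz and 16 kHz and then concatenated back into a full-band spectrum; it does not model the sub-networks or the multi-stage chain optimization. The 48 kHz sample rate, the FFT size of 960, and the helper names `band_slices`, `split_bands`, and `fuse_bands` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

SAMPLE_RATE = 48_000  # assumed: its 24 kHz Nyquist matches the stated top band edge
N_FFT = 960           # hypothetical FFT size, giving 50 Hz per bin


def band_slices(n_fft=N_FFT, sample_rate=SAMPLE_RATE,
                edges_hz=(0, 8_000, 16_000, 24_000)):
    """Return one slice of STFT bin indices per frequency band."""
    hz_per_bin = sample_rate / n_fft
    n_bins = n_fft // 2 + 1
    bounds = [int(round(edge / hz_per_bin)) for edge in edges_hz]
    bounds[-1] = n_bins  # keep the Nyquist bin in the last band
    return [slice(lo, hi) for lo, hi in zip(bounds[:-1], bounds[1:])]


def split_bands(spec, slices):
    """Split a (bins, frames) complex spectrum along the frequency axis."""
    return [spec[s] for s in slices]


def fuse_bands(bands):
    """Concatenate per-band spectra back into one full-band spectrum."""
    return np.concatenate(bands, axis=0)


# Random complex data stands in for a noisy STFT spectrum of 10 frames.
rng = np.random.default_rng(0)
shape = (N_FFT // 2 + 1, 10)
spec = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

slices = band_slices()
low, mid, high = split_bands(spec, slices)

# In a real system each band would pass through its own sub-network here;
# this sketch leaves the bands unchanged and fuses them directly.
fused = fuse_bands([low, mid, high])
```

With these assumed settings the low and middle bands each hold 160 bins and the high band holds 161, so fusing the untouched bands reproduces the original spectrum exactly.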
Related papers
- Fullsubnet: A Full-band And Sub-band Fusion Model For Real-time Single-channel Speech Enhancement (2020) 17.09
- FB-MSTCN: A Full-band Single-channel Speech Enhancement Method Based On Multi-scale Temporal Convolutional Network (2022) 6.77
- Dpt-fsnet: Dual-path Transformer Based Full-band And Sub-band Fusion Network For Speech Enhancement (2021) 0.00
- A Two-stage Full-band Speech Enhancement Model With Effective Spectral Compression Mapping (2022) 0.00
- Fast Fullsubnet: Accelerate Full-band And Sub-band Fusion Model For Single-channel Speech Enhancement (2022) 5.56
- Dbnet: A Dual-branch Network Architecture Processing On Spectrum And Waveform For Single-channel Speech Enhancement (2021) 8.09
- Lmfca-net: A Lightweight Model For Multi-channel Speech Enhancement With Efficient Narrow-band And Cross-band Attention (2025) 3.58
- Forknet: Simultaneous Time And Time-frequency Domain Modeling For Speech Enhancement (2023) 0.00