DPT-FSNet: Dual-path Transformer Based Full-band And Sub-band Fusion Network For Speech Enhancement
2021 · Feng Dang, Hangting Chen, Pengyuan Zhang
Abstract
Sub-band models have achieved promising results due to their ability to model local patterns in the spectrogram. Some studies further improve performance by fusing sub-band and full-band information. However, the structure of full-band and sub-band fusion models has not been fully explored. This paper proposes a dual-path transformer-based full-band and sub-band fusion network (DPT-FSNet) for speech enhancement in the frequency domain. The intra and inter parts of the dual-path transformer model sub-band and full-band information, respectively. The features used by the proposed method are more interpretable than those used by the time-domain dual-path transformer. We conducted experiments on the Voice Bank + DEMAND and Interspeech 2020 Deep Noise Suppression (DNS) datasets to evaluate the proposed method. Experimental results show that the proposed method outperforms current state-of-the-art methods.
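The intra/inter split described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that "sub-band" means the time sequence of a single frequency bin and that "full-band" means all frequency bins within one frame; the abstract does not state which axis each path runs along. The attention here has no parameters, and the learned projections, multiple heads, normalization and feed-forward layers of a real transformer are left out. The names `self_attention` and `dual_path_block` are hypothetical.

```python
import math

def self_attention(seq):
    """Parameter-free scaled dot-product self-attention over a list of vectors."""
    d = len(seq[0])
    out = []
    for q in seq:
        # Similarity of this query to every element of the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[i] for w, v in zip(weights, seq))
                    for i in range(d)])
    return out

def dual_path_block(x):
    """x[t][f] is a feature vector (list of floats) for frame t, frequency bin f.

    Assumption: the intra path attends over time within each frequency bin
    (sub-band), and the inter path attends over frequency within each frame
    (full-band). Each path adds a residual connection.
    """
    T, F = len(x), len(x[0])
    y = [[list(x[t][f]) for f in range(F)] for t in range(T)]

    # Intra path: one sequence per frequency bin, running along time.
    for f in range(F):
        seq = [y[t][f] for t in range(T)]
        att = self_attention(seq)
        for t in range(T):
            y[t][f] = [a + b for a, b in zip(seq[t], att[t])]

    # Inter path: one sequence per frame, running along frequency.
    for t in range(T):
        seq = [y[t][f] for f in range(F)]
        att = self_attention(seq)
        for f in range(F):
            y[t][f] = [a + b for a, b in zip(seq[f], att[f])]

    return y

# Usage: 4 frames, 3 frequency bins, 2 feature channels.
features = [[[0.1 * t, 0.2 * f] for f in range(3)] for t in range(4)]
enhanced = dual_path_block(features)
print(len(enhanced), len(enhanced[0]), len(enhanced[0][0]))
```

The output has the same frames × bins × channels layout as the input, so blocks of this kind can be stacked. Keeping the time and frequency axes explicit in the frequency domain is what makes the two paths readable as sub-band and full-band modelling.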
Related papers
- FullSubNet: A Full-band And Sub-band Fusion Model For Real-time Single-channel Speech Enhancement (2020) · 17.09
- Speech Enhancement With Perceptually-motivated Optimization And Dual Transformations (2022) · 0.00
- ForkNet: Simultaneous Time And Time-frequency Domain Modeling For Speech Enhancement (2023) · 0.00
- Fast FullSubNet: Accelerate Full-band And Sub-band Fusion Model For Single-channel Speech Enhancement (2022) · 5.56
- DMF-Net: A Decoupling-style Multi-band Fusion Model For Full-band Speech Enhancement (2022) · 7.16
- DBNet: A Dual-branch Network Architecture Processing On Spectrum And Waveform For Single-channel Speech Enhancement (2021) · 8.09
- Dual-path Transformer Network: Direct Context-aware Modeling For End-to-end Monaural Speech Separation (2020) · 18.24
- Efficient Encoder-decoder And Dual-path Conformer For Comprehensive Feature Learning In Speech Enhancement (2023) · 7.16