D2Former: A Fully Complex Dual-path Dual-decoder Conformer Network Using Joint Complex Masking And Complex Spectral Mapping For Monaural Speech Enhancement
2023 · Shengkui Zhao, Bin Ma
Abstract
Monaural speech enhancement has been widely studied using real-valued networks in the time-frequency (TF) domain. However, because the input and the target are naturally complex-valued in the TF domain, a fully complex network is highly desirable for effectively learning the feature representation and modelling the sequence in the complex domain. Moreover, phase, an important factor in the perceptual quality of speech, has been shown to be learnable together with magnitude from noisy speech using complex masking or complex spectral mapping. Many recent studies focus on either complex masking or complex spectral mapping, ignoring their performance boundaries. To address the above issues, we propose a fully complex dual-path dual-decoder conformer network (D2Former) using joint complex masking and complex spectral mapping for monaural speech enhancement. In D2Former, we extend the conformer network into the complex domain and form a dual-path complex TF self-attention architecture for effectively modelling the com
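As a rough illustration of the two target types the abstract names, the NumPy sketch below contrasts complex masking (the estimate is a complex mask multiplied element-wise with the noisy spectrogram) with complex spectral mapping (the estimate's real and imaginary parts are produced directly). It is not the D2Former model: the spectrograms are random toy arrays, the "network outputs" are oracle stand-ins, and the helper names `ideal_complex_ratio_mask` and `apply_complex_mask` are hypothetical names chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy complex TF representations (frequency bins x frames).
# Assumption: random arrays stand in for real STFTs of speech and noise.
shape = (4, 5)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
noisy = clean + noise


def ideal_complex_ratio_mask(clean_spec, noisy_spec, eps=1e-12):
    """Oracle complex mask M = S / Y, computed as S * conj(Y) / (|Y|^2 + eps)."""
    return clean_spec * np.conj(noisy_spec) / (np.abs(noisy_spec) ** 2 + eps)


def apply_complex_mask(mask, noisy_spec):
    """Complex masking: element-wise complex product, which scales magnitude
    and rotates phase in a single operation."""
    return mask * noisy_spec


# Complex masking: a model would predict `mask`; here the oracle mask is used.
mask = ideal_complex_ratio_mask(clean, noisy)
masked_estimate = apply_complex_mask(mask, noisy)

# Complex spectral mapping: a model would predict the real and imaginary
# parts of the clean spectrogram directly; here the oracle values are used.
mapped_real, mapped_imag = clean.real, clean.imag
mapped_estimate = mapped_real + 1j * mapped_imag

# With oracle targets both routes reconstruct the clean spectrogram.
masking_error = np.max(np.abs(masked_estimate - clean))
mapping_error = np.max(np.abs(mapped_estimate - clean))
```

With learned rather than oracle targets the two routes generally produce different errors, which is the motivation the abstract gives for considering them jointly.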
Related papers
- Uformer: A Unet Based Dilated Complex & Real Dual-path Conformer Network For Simultaneous Speech Enhancement And Dereverberation (2021)
- DF-Conformer: Integrated Architecture Of Conv-TasNet And Conformer Using Linear Complexity Self-attention For Speech Enhancement (2021)
- Deep Interaction Between Masking And Mapping Targets For Single-channel Speech Enhancement (2021)
- Efficient Encoder-decoder And Dual-path Conformer For Comprehensive Feature Learning In Speech Enhancement (2023)
- U-Former: Improving Monaural Speech Enhancement With Multi-head Self And Cross Attention (2022)
- Distortionless Multi-channel Target Speech Enhancement For Overlapped Speech Recognition (2020)
- DBT-Net: Dual-branch Federative Magnitude And Phase Estimation With Attention-in-attention Transformer For Monaural Speech Enhancement (2022)
- Phase-aware Speech Enhancement With Deep Complex U-Net (2019)