DPCCN: Densely-connected Pyramid Complex Convolutional Network For Robust Speech Separation And Extraction
2021 · Jiangyu Han, Yanhua Long, Lukas Burget, et al.
Abstract
In recent years, a number of time-domain speech separation methods have been proposed. However, most of them are very sensitive to the acoustic environment and to tasks with wide domain coverage. In this paper, from the time-frequency domain perspective, we propose a densely-connected pyramid complex convolutional network, termed DPCCN, to improve the robustness of speech separation under complicated conditions. Furthermore, we generalize DPCCN to target speech extraction (TSE) by integrating a new, specially designed speaker encoder. We also investigate the robustness of DPCCN on unsupervised cross-domain TSE tasks. A Mixture-Remix approach is proposed to adapt to the target-domain acoustic characteristics when fine-tuning the source model. We evaluate the proposed methods not only under noisy and reverberant in-domain conditions, but also under clean, cross-domain conditions. Results show that for both speech separation and extraction, the DPCCN-based systems achieve significantly better per
Related papers
- Dual-path Filter Network: Speaker-aware Modeling For Speech Separation (2021)
- DPCRN: Dual-path Convolution Recurrent Network For Single Channel Speech Enhancement (2021)
- FurcaNeXt: End-to-end Monaural Speech Separation With Dynamic Gated Dilated Temporal Convolutional Networks (2019)
- PDPCRN: Parallel Dual-path CRN With Bi-directional Inter-branch Interactions For Multi-channel Speech Enhancement (2023)
- Multi-channel Speech Separation Using Deep Embedding Model With Multilayer Bootstrap Networks (2019)
- An Efficient Speech Separation Network Based On Recurrent Fusion Dilated Convolution And Channel Attention (2023)
- DESNet: A Multi-channel Network For Simultaneous Speech Dereverberation, Enhancement And Separation (2020)
- Deformable Temporal Convolutional Networks For Monaural Noisy Reverberant Speech Separation (2022)