Embedding Recurrent Layers With Dual-path Strategy In A Variant Of Convolutional Network For Speaker-independent Speech Separation
2022 · Xue Yang, Changchun Bao
Abstract
Speaker-independent speech separation has achieved remarkable performance in recent years with the development of deep neural networks (DNNs). Various network architectures, from the traditional convolutional neural network (CNN) and recurrent neural network (RNN) to the more recent transformer, have been carefully designed to improve separation performance. However, state-of-the-art models usually suffer from several computational drawbacks, such as large model size, high memory consumption and high computational complexity. To balance performance against computational efficiency and to further explore the modeling ability of traditional network structures, we combine an RNN with a newly proposed variant of the convolutional network to address the speech separation problem. By embedding two RNNs into the basic block of this variant with the help of a dual-path strategy, the proposed network can effectively learn both local information and global dependencies. Besides, a four-staged struct
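The dual-path idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the abstract does not give the block design, so the sketch only shows the general pattern of splitting a long sequence into overlapping chunks, running one recurrent pass inside each chunk (local information) and a second pass across chunks (global dependency). The 50% overlap, the plain tanh RNN, and the names `segment`, `simple_rnn` and `dual_path` are all assumptions made for illustration.

```python
import numpy as np


def segment(x, chunk_len):
    """Split a (T, F) sequence into 50%-overlapping chunks, shape (S, K, F)."""
    hop = chunk_len // 2  # assumed 50% overlap between neighbouring chunks
    n_frames = x.shape[0]
    n_chunks = max(1, int(np.ceil((n_frames - chunk_len) / hop)) + 1)
    padded_len = (n_chunks - 1) * hop + chunk_len
    # Zero-pad the tail so the last chunk is complete.
    x = np.pad(x, ((0, padded_len - n_frames), (0, 0)))
    return np.stack([x[i * hop: i * hop + chunk_len] for i in range(n_chunks)])


def simple_rnn(seq, w_in, w_rec):
    """Plain tanh RNN over axis 0 of an (L, F) sequence; returns (L, H)."""
    h = np.zeros(w_rec.shape[0])
    states = []
    for frame in seq:
        h = np.tanh(frame @ w_in + h @ w_rec)
        states.append(h)
    return np.stack(states)


def dual_path(chunks, intra_weights, inter_weights):
    """Apply an intra-chunk RNN, then an inter-chunk RNN, to (S, K, F) chunks."""
    # Intra-chunk pass: recurrence along the K frames inside each chunk.
    intra = np.stack([simple_rnn(c, *intra_weights) for c in chunks])
    # Inter-chunk pass: recurrence along the S chunks at each in-chunk position.
    n_positions = intra.shape[1]
    inter = np.stack(
        [simple_rnn(intra[:, k], *inter_weights) for k in range(n_positions)],
        axis=1,
    )
    return inter  # shape (S, K, H)
```

With chunk length K chosen near the square root of the sequence length, each recurrent pass sees a sequence of roughly that length instead of the full input, which is the usual motivation for the dual-path arrangement.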
Related papers
- Dual-path RNN: Efficient Long Sequence Modeling For Time-domain Single-channel Speech Separation (2019)
- Speech Separation Using An Asynchronous Fully Recurrent Convolutional Neural Network (2021)
- LaFurca: Iterative Refined Speech Separation Based On Context-aware Dual-path Parallel Bi-LSTM (2020)
- Dual-path Filter Network: Speaker-aware Modeling For Speech Separation (2021)
- DualSep: A Light-weight Dual-encoder Convolutional Recurrent Network For Real-time In-car Speech Separation (2024)
- Real-time Speech Enhancement And Separation With A Unified Deep Neural Network For Single/dual Talker Scenarios (2023)
- Attention Is All You Need In Speech Separation (2020)
- Single-channel Speech Separation With Auxiliary Speaker Embeddings (2019)