Dual-path Filter Network: Speaker-aware Modeling For Speech Separation
2021 · Fan-Lin Wang, Yu-Huai Peng, Hung-Shin Lee, et al.
Abstract
Speech separation has been studied extensively in recent years as a way to address the cocktail party problem. Existing approaches fall into two categories: time-frequency domain methods and time domain methods. In addition, some methods generate speaker vectors to support source separation. In this study, we propose a new model called the dual-path filter network (DPFN). Our model focuses on the post-processing stage of speech separation to improve separation performance. DPFN is composed of two parts: the speaker module and the separation module. First, the speaker module infers the identities of the speakers. Then, the separation module uses the speakers' information to extract the voices of the individual speakers from the mixture. Built on DPRNN-TasNet, DPFN not only outperforms DPRNN-TasNet but also avoids the problem of permutation-invariant training (PIT).
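To make the PIT remark concrete, the following is a minimal pure-Python sketch of what permutation-invariant training has to do, contrasted with a fixed speaker-to-output assignment of the kind that speaker information makes possible. It is not the authors' implementation. The function names (`si_snr`, `pit_loss`, `fixed_order_loss`) are hypothetical, and the use of SI-SNR as the objective is an assumption (it is common in time-domain separation work but is not stated in the abstract).

```python
import itertools
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-noise ratio (dB) of two equal-length sequences."""
    # Remove the mean so a constant offset does not affect the score.
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target; the residual counts as noise.
    dot = sum(a * b for a, b in zip(e, t))
    t_energy = sum(b * b for b in t) + eps
    proj = [dot / t_energy * b for b in t]
    noise = [a - p for a, p in zip(e, proj)]
    proj_energy = sum(p * p for p in proj)
    noise_energy = sum(n * n for n in noise) + eps
    return 10.0 * math.log10(proj_energy / noise_energy + eps)


def pit_loss(estimates, targets):
    """Negative mean SI-SNR under the best output-to-speaker assignment.

    Every permutation is tried, which is the search PIT performs because the
    network's output order is arbitrary.
    """
    best_loss, best_perm = None, None
    for perm in itertools.permutations(range(len(targets))):
        score = sum(
            si_snr(estimates[i], targets[p]) for i, p in enumerate(perm)
        ) / len(perm)
        if best_loss is None or -score < best_loss:
            best_loss, best_perm = -score, perm
    return best_loss, best_perm


def fixed_order_loss(estimates, targets):
    """Negative mean SI-SNR when output i is tied to speaker i in advance.

    Assumption: some side information (for example a speaker vector) has
    already fixed which output belongs to which speaker, so no search is needed.
    """
    return -sum(si_snr(e, t) for e, t in zip(estimates, targets)) / len(targets)
```

With two speakers the permutation search is cheap, but it grows factorially with the number of speakers, and the assignment can change between training steps. Once the output order is fixed by speaker information, the loss reduces to the single evaluation in `fixed_order_loss`.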
Related papers
- Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters (2023)
- Dual-path Transformer Network: Direct Context-aware Modeling For End-to-end Monaural Speech Separation (2020)
- Speech Separation Based On Multi-stage Elaborated Dual-path Deep BiLSTM With Auxiliary Identity Loss (2020)
- TasNet: Time-domain Audio Separation Network For Real-time, Single-channel Speech Separation (2017)
- Multi-scale Feature Fusion Transformer Network For End-to-end Single Channel Speech Separation (2022)
- La Furca: Iterative Refined Speech Separation Based On Context-aware Dual-path Parallel Bi-LSTM (2020)
- Dual-path RNN: Efficient Long Sequence Modeling For Time-domain Single-channel Speech Separation (2019)
- DPCCN: Densely-connected Pyramid Complex Convolutional Network For Robust Speech Separation And Extraction (2021)