Multi-channel Narrow-band Deep Speech Separation With Full-band Permutation Invariant Training
2021 · Changsheng Quan, Xiaofei Li
Abstract
This paper addresses the problem of multi-channel multi-speech separation based on deep learning techniques. In the short-time Fourier transform (STFT) domain, we propose an end-to-end narrow-band network that directly takes as input the multi-channel mixture signals of one frequency and outputs the separated signals of that frequency. Within a narrow band, the spatial information (or inter-channel difference) discriminates well between speakers at different positions. This information is used extensively in many narrow-band speech separation methods, such as beamforming and clustering of spatial vectors. The proposed network is trained to learn a rule that automatically exploits this information to perform speech separation. Such a rule should be valid for any frequency, hence the network is shared by all frequencies. In addition, a full-band permutation invariant training criterion is proposed to solve the frequency permutation problem encountered by most narrow-band methods. Experiments show
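The difference between a full-band and a per-frequency permutation choice can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the training loss, so a plain squared error on STFT coefficients is assumed here, and the function names `sq_error`, `fullband_pit`, and `per_frequency_pit` are hypothetical. Signals are nested lists indexed as `x[speaker][frequency][frame]`.

```python
import itertools

def sq_error(est_f, ref_f):
    """Summed squared error between two sequences of complex STFT coefficients.
    Assumption: the paper's actual loss may differ; this is only for illustration."""
    return sum(abs(e - r) ** 2 for e, r in zip(est_f, ref_f))

def fullband_pit(est, ref):
    """Full-band PIT: pick ONE output-to-speaker permutation for the whole band.
    est[n][f] / ref[n][f] hold the frames of output/speaker n at frequency f."""
    n_spk, n_freq = len(ref), len(ref[0])
    best = None
    for perm in itertools.permutations(range(n_spk)):
        # The same assignment perm is applied at every frequency f.
        loss = sum(sq_error(est[perm[n]][f], ref[n][f])
                   for n in range(n_spk) for f in range(n_freq))
        if best is None or loss < best[0]:
            best = (loss, perm)
    return best  # (loss, permutation)

def per_frequency_pit(est, ref):
    """Contrast case: every frequency picks its own best permutation,
    so the speaker order may differ from one frequency to the next."""
    n_spk, n_freq = len(ref), len(ref[0])
    perms = list(itertools.permutations(range(n_spk)))
    total, chosen = 0.0, []
    for f in range(n_freq):
        losses = [sum(sq_error(est[p[n]][f], ref[n][f]) for n in range(n_spk))
                  for p in perms]
        k = losses.index(min(losses))
        total += losses[k]
        chosen.append(perms[k])
    return total, chosen

# Toy data: 2 speakers, 2 frequencies, 2 frames (values are made up).
ref = [[[1 + 0j, 2 + 0j], [3 + 0j, 4 + 0j]],
       [[-1 + 0j, 0j], [1j, 2j]]]

# Outputs swapped consistently at all frequencies: full-band PIT accepts this.
swapped_all = [ref[1], ref[0]]

# Outputs swapped only at frequency 1: a frequency permutation error.
swapped_f1 = [[ref[0][0], ref[1][1]],
              [ref[1][0], ref[0][1]]]
```

In this toy setting, `per_frequency_pit(swapped_f1, ref)` reports zero loss even though the speaker order flips between frequencies, whereas `fullband_pit(swapped_f1, ref)` returns a positive loss. Under the assumed loss, that is the sense in which a full-band criterion penalizes frequency permutation errors during training.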
Related papers
- End-to-end Multi-channel Speech Separation (2019)
- Single-channel Speech Separation Using Soft-minimum Permutation Invariant Training (2021)
- Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters (2023)
- End-to-end Networks For Supervised Single-channel Speech Separation (2018)
- Efficient Integration Of Multi-channel Information For Speaker-independent Speech Separation (2020)
- End-to-end Speech Separation With Unfolded Iterative Phase Reconstruction (2018)
- Cracking The Cocktail Party Problem By Multi-beam Deep Attractor Network (2018)
- FurcaNet: An End-to-end Deep Gated Convolutional, Long Short-term Memory, Deep Neural Networks For Single Channel Speech Separation (2019)