Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters
2023 · Kristina Tesch, Timo Gerkmann
Abstract
In a multi-channel separation task with multiple speakers, we aim to recover all individual speech signals from the mixture. In contrast to single-channel approaches, which rely on the different spectro-temporal characteristics of the speech signals, multi-channel approaches should additionally utilize the different spatial locations of the sources for a more powerful separation especially when the number of sources increases. To enhance the spatial processing in a multi-channel source separation scenario, in this work, we propose a deep neural network (DNN) based spatially selective filter (SSF) that can be spatially steered to extract the speaker of interest by initializing a recurrent neural network layer with the target direction. We compare the proposed SSF with a common end-to-end direct separation (DS) approach trained using utterance-wise permutation invariant training (PIT), which only implicitly learns to perform spatial filtering. We show that the SSF has a clear advantage o
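The utterance-wise permutation invariant training (PIT) criterion used for the DS baseline can be illustrated with a minimal sketch. It assumes a plain mean-squared-error loss and toy signals given as Python lists. The abstract does not state which loss the authors use, and the names `mse` and `utterance_pit_loss` are hypothetical. The idea is to evaluate the loss for every assignment of network outputs to reference speakers over the whole utterance and keep the smallest one.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences (assumed loss)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def utterance_pit_loss(estimates, references, loss_fn=mse):
    """Return (minimum mean loss, best permutation) over whole utterances.

    best_perm[i] is the index of the reference assigned to estimates[i].
    One permutation is chosen per utterance, not per frame.
    """
    n = len(estimates)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # Average the pairwise loss under this output-to-speaker assignment.
        loss = sum(loss_fn(estimates[i], references[j])
                   for i, j in enumerate(perm)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy example: the outputs contain the two speakers in swapped order.
references = [[1.0, 0.0, -1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
estimates = [[0.5, 0.5, 0.5, 0.5], [1.0, 0.0, -1.0, 0.0]]
loss, perm = utterance_pit_loss(estimates, references)
print(loss, perm)
```

The exhaustive search grows factorially with the number of speakers, which is acceptable for the two or three outputs of a toy case but becomes costly for larger source counts.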
Related papers
- Temporal-spatial Neural Filter: Direction Informed End-to-end Multi-channel Target Speech Separation (2020)
- Multi-channel Narrow-band Deep Speech Separation With Full-band Permutation Invariant Training (2021)
- Enhancing End-to-end Multi-channel Speech Separation Via Spatial Feature Learning (2020)
- End-to-end Multi-channel Speech Separation (2019)
- SpatialNet: Extensively Learning Spatial Information For Multichannel Joint Speech Separation, Denoising And Dereverberation (2023)
- Complex Neural Spatial Filter: Enhancing Multi-channel Target Speech Separation In Complex Domain (2021)
- Efficient Integration Of Multi-channel Information For Speaker-independent Speech Separation (2020)
- End-to-end Networks For Supervised Single-channel Speech Separation (2018)