3D Neural Beamforming For Multi-channel Speech Separation Against Location Uncertainty
2023 · Rongzhi Gu, Shi-Xiong Zhang, Dong Yu
Abstract
Multi-channel speech separation using the speaker's directional information has demonstrated significant gains over blind speech separation. However, it has two limitations. First, substantial performance degradation is observed when the arrival directions of two sounds are close. Second, the result relies heavily on precise estimation of the speaker's direction. To overcome these issues, this paper proposes 3D features and an associated 3D neural beamformer for multi-channel speech separation. Previous work in this area is extended in two important directions. First, the traditional 1D directional beam patterns are generalized to 3D. This enables the model to extract speech from any target region in the 3D space. Thus, speakers with similar directions but different elevations or distances become separable. Second, to handle the speaker location uncertainty, the previously proposed spatial feature is extended to a new 3D region feature. The proposed 3D region feature and 3D neural beamform
Related papers
- Locate And Beamform: Two-dimensional Locating All-neural Beamformer For Multi-channel Speech Separation (2023) · 3.58
- Sequential Multi-frame Neural Beamforming For Speech Separation And Enhancement (2019) · 0.00
- Towards Unified All-neural Beamforming For Time And Frequency Domain Speech Separation (2022) · 11.29
- MIMO-DBnet: Multi-channel Input And Multiple Outputs DOA-aware Beamforming Network For Speech Separation (2022) · 0.00
- Multichannel Loss Function For Supervised Speech Source Separation By Mask-based Beamforming (2019) · 7.50
- Dual-path Transformer Based Neural Beamformer For Target Speech Extraction (2023) · 0.00
- Embedding And Beamforming: All-neural Causal Beamformer For Multichannel Speech Enhancement (2021) · 13.05
- Enhanced Neural Beamformer With Spatial Information For Target Speech Extraction (2023) · 2.26