Dual-path Transformer Based Neural Beamformer For Target Speech Extraction
2023 · Aoqi Guo, Sichong Qian, Baoxiang Li, et al.
Abstract
Neural beamformers, which integrate both pre-separation and beamforming modules, have demonstrated impressive effectiveness in target speech extraction. Nevertheless, the performance of these beamformers is inherently limited by the predictive accuracy of the pre-separation module. In this paper, we introduce a neural beamformer supported by a dual-path transformer. Initially, we employ the cross-attention mechanism in the time domain to extract crucial spatial information related to beamforming from the noisy covariance matrix. Subsequently, in the frequency domain, the self-attention mechanism is employed to enhance the model's ability to process frequency-specific details. By design, our model circumvents the influence of pre-separation modules, delivering performance in a more comprehensive end-to-end manner. Experimental results reveal that our model not only outperforms contemporary leading neural beamforming algorithms in separation performance but also achieves this with a sign
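The "noisy covariance matrix" in the abstract is not defined in this excerpt. One common definition, assumed here, is the spatial covariance of the multichannel mixture at a given frequency bin, estimated by averaging outer products of the STFT observation vectors over time. The sketch below illustrates only that estimate; the function name `spatial_covariance` and the toy data are hypothetical, and this is not the authors' implementation.

```python
def spatial_covariance(frames):
    """Estimate a spatial covariance matrix for one frequency bin.

    frames: list of T observation vectors, each a list of C complex STFT
    coefficients (one per microphone).
    Returns a C x C matrix (list of lists): Phi = (1/T) * sum_t y_t y_t^H.
    Assumption: plain time averaging; the paper may use a different estimate.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    phi = [[0j] * num_channels for _ in range(num_channels)]
    for y in frames:
        for i in range(num_channels):
            for j in range(num_channels):
                # Outer product entry: y_i times the conjugate of y_j.
                phi[i][j] += y[i] * y[j].conjugate()
    return [[value / num_frames for value in row] for row in phi]


# Toy data (hypothetical): 2 microphones, 3 time frames, one frequency bin.
frames = [
    [1 + 1j, 1 - 1j],
    [2 + 0j, 0 + 2j],
    [0 - 1j, 1 + 0j],
]
phi = spatial_covariance(frames)
```

The result is Hermitian with real, non-negative diagonal entries (the per-microphone power), while the off-diagonal entries carry the inter-channel phase and level differences that beamforming relies on.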
Related papers
- Enhanced Neural Beamformer With Spatial Information For Target Speech Extraction (2023) 2.26
- Locate And Beamform: Two-dimensional Locating All-neural Beamformer For Multi-channel Speech Separation (2023) 3.58
- Sequential Multi-frame Neural Beamforming For Speech Separation And Enhancement (2019) 0.00
- Towards Unified All-neural Beamforming For Time And Frequency Domain Speech Separation (2022) 11.29
- Improving Speaker Discrimination Of Target Speech Extraction With Time-domain Speakerbeam (2020) 14.76
- Attention-based Neural Beamforming Layers For Multi-channel Speech Recognition (2021) 0.00
- Embedding And Beamforming: All-neural Causal Beamformer For Multichannel Speech Enhancement (2021) 13.05
- A Unified Multichannel Far-field Speech Recognition System: Combining Neural Beamforming With Attention Based End-to-end Model (2024) 0.00