Improving Channel Decorrelation For Multi-channel Target Speech Extraction
2021 · Jiangyu Han, Wei Rao, Yannan Wang, et al.
Abstract
Target speech extraction has attracted widespread attention. When microphone arrays are available, the additional spatial information can be helpful in extracting the target speech. We have recently proposed a channel decorrelation (CD) mechanism to extract the inter-channel differential information to enhance the reference channel encoder representation. Although the proposed mechanism has shown promising results for extracting the target speech from mixtures, the extraction performance is still limited by the nature of the original decorrelation theory. In this paper, we propose two methods to broaden the horizon of the original channel decorrelation, by replacing the original softmax-based inter-channel similarity between encoder representations, using an unrolled probability and a normalized cosine-based similarity at the dimensional-level. Moreover, new combination strategies of the CD-based spatial information and target speaker adaptation of parallel encoder outputs are also inv
Related papers
- Multi-channel Target Speech Extraction With Channel Decorrelation And Target Speaker Adaptation (2020) 0.00
- Time-domain Speech Extraction With Spatial Information And Multi Speaker Conditioning Mechanism (2021) 7.81
- Multi-channel Speaker Verification For Single And Multi-talker Speech (2020) 0.00
- Dualstream Contextual Fusion Network: Efficient Target Speaker Extraction By Leveraging Mixture And Enrollment Interactions (2025) 0.00
- Distortionless Multi-channel Target Speech Enhancement For Overlapped Speech Recognition (2020) 0.00
- Exploring The Potential Of Data-driven Spatial Audio Enhancement Using A Single-channel Model (2024) 0.00
- Speaker Reinforcement Using Target Source Extraction For Robust Automatic Speech Recognition (2022) 7.50
- Spatial-DCCRN: DCCRN Equipped With Frame-level Angle Feature And Hybrid Filtering For Multi-channel Speech Enhancement (2022) 5.84