Multi-channel Target Speech Extraction With Channel Decorrelation And Target Speaker Adaptation
2020 · Jiangyu Han, Xinyuan Zhou, Yanhua Long, et al.
Abstract
End-to-end approaches to single-channel target speech extraction have attracted widespread attention, but studies of end-to-end multi-channel target speech extraction remain relatively limited. In this work, we propose two methods for exploiting multi-channel spatial information to extract the target speech. The first uses a target speech adaptation layer in a parallel encoder architecture. The second is a channel decorrelation mechanism that extracts inter-channel differential information to enhance the multi-channel encoder representation. We compare the proposed methods with two strong state-of-the-art baselines. Experimental results on the multi-channel reverberant WSJ0 2-mix dataset show that the proposed methods achieve up to 11.2% and 11.5% relative improvements in SDR and SiSDR, respectively, which are, to the best of our knowledge, the best reported results on this task.
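To make the reported metric concrete, the following is a minimal sketch of how a scale-invariant SDR (SiSDR) score and a relative improvement over a baseline might be computed. It assumes the commonly used SI-SDR definition (zero-mean signals, projection of the estimate onto the reference); the function names `si_sdr` and `relative_improvement` and the `eps` constant are hypothetical choices for illustration, and this is not the authors' evaluation code.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length sample sequences.

    Assumption: the commonly used SI-SDR definition, not necessarily
    the exact evaluation script used in the paper.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")

    # Remove the mean so a DC offset does not influence the score.
    mean_e = sum(estimate) / len(estimate)
    mean_r = sum(reference) / len(reference)
    e = [x - mean_e for x in estimate]
    r = [x - mean_r for x in reference]

    # Project the estimate onto the reference to obtain the scaled target.
    dot = sum(a * b for a, b in zip(e, r))
    ref_energy = sum(b * b for b in r)
    scale = dot / (ref_energy + eps)
    target = [scale * b for b in r]

    # Everything in the estimate that is not the scaled target is error.
    noise = [a - t for a, t in zip(e, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)

    # eps guards against division by zero for a perfect estimate.
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def relative_improvement(baseline, proposed):
    """Relative improvement of `proposed` over `baseline`, in percent."""
    return 100.0 * (proposed - baseline) / baseline
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is what distinguishes SiSDR from plain SDR.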
Related papers
- Improving Channel Decorrelation For Multi-channel Target Speech Extraction (2021)
- Time-domain Speech Extraction With Spatial Information And Multi Speaker Conditioning Mechanism (2021)
- End-to-end Dereverberation, Beamforming, And Speech Recognition With Improved Numerical Stability And Advanced Frontend (2021)
- Multi-channel Speaker Verification For Single And Multi-talker Speech (2020)
- End-to-end Multi-channel Speaker Extraction And Binaural Speech Synthesis (2024)
- Exploiting Single-channel Speech For Multi-channel End-to-end Speech Recognition (2021)
- End-to-end Multichannel Speaker-attributed ASR: Speaker Guided Decoder And Input Feature Analysis (2023)
- End-to-end Multi-channel Speech Separation (2019)