Multi-channel Multi-frame ADL-MVDR For Target Speech Separation
2020 · Zhuohuang Zhang, Yong Xu, Meng Yu, et al.
Abstract
Many purely neural-network-based speech separation approaches have been proposed to improve objective assessment scores, but they often introduce nonlinear distortions that are harmful to modern automatic speech recognition (ASR) systems. Minimum variance distortionless response (MVDR) filters are often adopted to remove nonlinear distortions; however, conventional neural mask-based MVDR systems still leave relatively high levels of residual noise. Moreover, the matrix inverse involved in the MVDR solution is sometimes numerically unstable during joint training with neural networks. In this study, we propose a multi-channel multi-frame (MCMF) all deep learning (ADL)-MVDR approach for target speech separation, which extends our preliminary multi-channel ADL-MVDR approach. The proposed MCMF ADL-MVDR system addresses both linear and nonlinear distortions, and it fully utilizes spatio-temporal cross correlations. The proposed systems are evaluated using a Mandarin
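For context on the matrix inverse mentioned in the abstract, the following is a minimal sketch of the conventional MVDR solution for a single frequency bin, in the common reference-channel form w = (Φ_nn⁻¹ Φ_ss / trace(Φ_nn⁻¹ Φ_ss)) u. It is not the ADL-MVDR method proposed in the paper. The function name `mvdr_weights`, the `diag_load` parameter, and the trace-scaled diagonal loading are assumptions chosen for illustration; diagonal loading is one common way to stabilize the inverse.

```python
import numpy as np

def mvdr_weights(phi_ss, phi_nn, ref_channel=0, diag_load=1e-6):
    """Conventional MVDR weights for one frequency bin (illustrative sketch).

    phi_ss, phi_nn: (C, C) Hermitian speech and noise covariance matrices.
    diag_load: hypothetical regularization factor, scaled by the mean
               diagonal power of phi_nn.
    Returns a length-C complex weight vector w; the enhanced bin is w^H y.
    """
    num_channels = phi_nn.shape[0]
    # Diagonal loading keeps the linear solve well conditioned when
    # phi_nn is close to singular.
    load = diag_load * np.trace(phi_nn).real / num_channels
    phi_nn_reg = phi_nn + load * np.eye(num_channels)
    # Solve phi_nn_reg @ X = phi_ss rather than forming the inverse explicitly.
    numerator = np.linalg.solve(phi_nn_reg, phi_ss)
    # Pick the reference-channel column and normalize by the trace.
    return numerator[:, ref_channel] / np.trace(numerator)
```

With a rank-1 speech covariance built from a steering vector v and no loading, the weights satisfy the distortionless constraint w^H v = v[ref_channel], so the target component at the reference microphone passes through unchanged.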
Related papers
- ADL-MVDR: All Deep Learning MVDR Beamformer For Target Speech Separation (2020) · 15.00
- Deep Multi-frame MVDR Filtering For Single-microphone Speech Enhancement (2020) · 9.03
- Complex Neural Spatial Filter: Enhancing Multi-channel Target Speech Separation In Complex Domain (2021) · 11.85
- Audio-visual Speech Separation And Dereverberation With A Two-stage Multimodal Network (2019) · 12.47
- Unsupervised Speech Enhancement Based On Multichannel NMF-informed Beamforming For Noise-robust Automatic Speech Recognition (2019) · 13.23
- Multichannel Singing Voice Separation By Deep Neural Network Informed DOA Constrained CNMF (2020) · 5.84
- Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters (2023) · 10.35
- Audio-visual Multi-channel Speech Separation, Dereverberation And Recognition (2022) · 6.77