On The Role Of Spatial, Spectral, And Temporal Processing For DNN-based Non-linear Multi-channel Speech Enhancement
2022 · Kristina Tesch, Nils-Hendrik Mohrmann, Timo Gerkmann
Abstract
Employing deep neural networks (DNNs) to directly learn filters for multi-channel speech enhancement has potentially two key advantages over a traditional approach combining a linear spatial filter with an independent tempo-spectral post-filter: 1) non-linear spatial filtering makes it possible to overcome potential restrictions originating from a linear processing model, and 2) joint processing of spatial and tempo-spectral information makes it possible to exploit interdependencies between different sources of information. A variety of DNN-based non-linear filters have been proposed recently, for which good enhancement performance is reported. However, little is known about their internal mechanisms, which turns network architecture design into a game of chance. Therefore, in this paper, we perform experiments to better understand the internal processing of spatial, spectral, and temporal information by DNN-based non-linear filters. On the one hand, our experiments in a difficult speech extraction scenario conf
Related papers
- Insights Into Deep Non-linear Filters For Improved Multi-channel Speech Enhancement (2022) · 13.93
- Consistency-aware Multi-channel Speech Enhancement Using Deep Neural Networks (2020) · 0.00
- Decoupled Spatial And Temporal Processing For Resource Efficient Multichannel Speech Enhancement (2024) · 0.00
- Multichannel Speech Enhancement Without Beamforming (2021) · 9.41
- Multi-modal Hybrid Deep Neural Network For Speech Enhancement (2016) · 0.00
- How To Leverage DNN-based Speech Enhancement For Multi-channel Speaker Verification? (2022) · 0.00
- Deep Neural Network Techniques For Monaural Speech Enhancement: State Of The Art Analysis (2022) · 0.00
- Spatial-DCCRN: DCCRN Equipped With Frame-level Angle Feature And Hybrid Filtering For Multi-channel Speech Enhancement (2022) · 5.84