Insights Into Deep Non-linear Filters For Improved Multi-channel Speech Enhancement
2022 Β· Kristina Tesch, Timo Gerkmann
Abstract
The key advantage of using multiple microphones for speech enhancement is that spatial filtering can be used to complement the tempo-spectral processing. In a traditional setting, linear spatial filtering (beamforming) and single-channel post-filtering are commonly performed separately. In contrast, there is a trend towards employing deep neural networks (DNNs) to learn a joint spatial and tempo-spectral non-linear filter, which means that the restriction of a linear processing model and that of a separate processing of spatial and tempo-spectral information can potentially be overcome. However, the internal mechanisms that lead to good performance of such data-driven filters for multi-channel speech enhancement are not well understood. Therefore, in this work, we analyse the properties of a non-linear spatial filter realized by a DNN as well as its interdependency with temporal and spectral processing by carefully controlling the information sources (spatial, spectral, and temporal) a
Authors
(none)
Tags
Stats
Related papers
- On The Role Of Spatial, Spectral, And Temporal Processing For Dnn-based Non-linear Multi-channel Speech Enhancement (2022)7.81
- Consistency-aware Multi-channel Speech Enhancement Using Deep Neural Networks (2020)0.00
- Multichannel Speech Enhancement Without Beamforming (2021)9.41
- Decoupled Spatial And Temporal Processing For Resource Efficient Multichannel Speech Enhancement (2024)0.00
- Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters (2023)10.35
- On The Importance Of Neural Wiener Filter For Resource Efficient Multichannel Speech Enhancement (2024)0.00
- Efficient Multi-channel Speech Enhancement With Spherical Harmonics Injection For Directional Encoding (2023)3.58
- Multi-channel End-to-end Neural Network For Speech Enhancement, Source Localization, And Voice Activity Detection (2022)0.00