On The Importance Of Neural Wiener Filter For Resource Efficient Multichannel Speech Enhancement
2024 · Tsun-An Hsieh, Jacob Donley, Daniel Wong, et al.
Abstract
We introduce a time-domain framework for efficient multichannel speech enhancement, emphasizing low latency and computational efficiency. This framework incorporates two compact deep neural networks (DNNs) surrounding a multichannel neural Wiener filter (NWF). The first DNN enhances the speech signal to estimate NWF coefficients, while the second DNN refines the output from the NWF. The NWF, while conceptually similar to the traditional frequency-domain Wiener filter, undergoes a training process optimized for low-latency speech enhancement, involving fine-tuning of both analysis and synthesis transforms. Our research results illustrate that the NWF output, having minimal nonlinear distortions, attains performance levels akin to those of the first DNN, deviating from conventional Wiener filter paradigms. Training all components jointly outperforms sequential training, despite its simplicity. Consequently, this framework achieves superior performance with fewer parameters and reduced co
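For orientation, the traditional frequency-domain Wiener filter that the abstract contrasts with the NWF can be sketched as below. This is a minimal NumPy sketch of the classical per-frequency multichannel filter, not the paper's learned NWF. The function name, the array shapes, the choice of reference channel, and the diagonal-loading constant are all assumptions made for illustration.

```python
import numpy as np


def multichannel_wiener_filter(Y, S, ref=0, eps=1e-6):
    """Classical per-frequency multichannel Wiener filter (illustrative sketch).

    Y:   noisy mixture STFT, complex array of shape (channels, freqs, frames).
    S:   speech estimate STFT with the same shape, e.g. from a first-stage enhancer.
    ref: index of the reference channel whose speech component is the target.
    eps: diagonal loading added for numerical stability (assumed value).

    Returns the filtered reference-channel STFT, shape (freqs, frames).
    """
    C, F, T = Y.shape
    # Mixture spatial covariance per frequency: phi_yy[f] ~ E[y y^H], shape (F, C, C).
    phi_yy = np.einsum("cft,dft->fcd", Y, Y.conj()) / T
    # Cross-covariance with the target: phi_ys[f] ~ E[y s_ref^*], shape (F, C).
    phi_ys = np.einsum("cft,ft->fc", Y, S[ref].conj()) / T
    # Diagonal loading keeps the linear solve well conditioned.
    phi_yy = phi_yy + eps * np.eye(C)[None]
    # Minimising E|s_ref - w^H y|^2 gives w = phi_yy^{-1} phi_ys for each frequency.
    w = np.linalg.solve(phi_yy, phi_ys[..., None])[..., 0]
    # Apply the filter: output[f, t] = w[f]^H y[f, t].
    return np.einsum("fc,cft->ft", w.conj(), Y)
```

In the framework the abstract describes, the speech estimate would come from the first DNN and the filter output would be passed to the second DNN for refinement. The paper's NWF additionally learns its analysis and synthesis transforms, which this fixed-STFT sketch does not attempt to reproduce.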
Related papers
- Decoupled Spatial And Temporal Processing For Resource Efficient Multichannel Speech Enhancement (2024) · 0.00
- Consistency-aware Multi-channel Speech Enhancement Using Deep Neural Networks (2020) · 0.00
- Insights Into Deep Non-linear Filters For Improved Multi-channel Speech Enhancement (2022) · 13.93
- On The Role Of Spatial, Spectral, And Temporal Processing For DNN-based Non-linear Multi-channel Speech Enhancement (2022) · 7.81
- DBNet: A Dual-branch Network Architecture Processing On Spectrum And Waveform For Single-channel Speech Enhancement (2021) · 8.09
- Narrow-band Deep Filtering For Multichannel Speech Enhancement (2019) · 0.00
- LMFCA-Net: A Lightweight Model For Multi-channel Speech Enhancement With Efficient Narrow-band And Cross-band Attention (2025) · 3.58
- Multichannel Speech Enhancement Without Beamforming (2021) · 9.41