TaylorBeamformer: Learning All-neural Beamformer For Multi-channel Speech Enhancement From Taylor's Approximation Theory
2022 · Andong Li, Guochen Yu, Chengshi Zheng, et al.
Abstract
While existing end-to-end beamformers achieve impressive performance in various front-end speech processing tasks, they usually encapsulate the whole process in a black box and thus lack adequate interpretability. To address this gap, we propose TaylorBeamformer, a novel neural beamformer for multi-channel speech enhancement inspired by Taylor's approximation theory. The core idea is that the recovery process can be formulated as spatial filtering in the neighborhood of the input mixture. Based on that, we decompose it into the superposition of the 0th-order non-derivative term and the high-order derivative terms, where the former serves as the spatial filter and the latter are viewed as a residual noise canceller that further improves speech quality. To enable end-to-end training, we replace the derivative operations with trainable networks, so that they can be learned from training data. Extensive experiments are conducted on the synthesized dataset based on LibriSpeech and
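The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the tensor layout (microphones, frequency bins, frames), the recursive way each high-order term is derived from the previous one, and the 1/q! weighting borrowed from the ordinary Taylor series are all assumptions made for illustration. The `residual_ops` callables are hypothetical stand-ins for the trainable networks that replace the derivative operations.

```python
import math
import numpy as np


def zeroth_order_term(mixture, weights):
    """Filter-and-sum spatial filter: sum over microphones of conj(w) * y.

    mixture, weights: complex arrays of shape (mics, freq_bins, frames).
    Returns a complex array of shape (freq_bins, frames).
    """
    return np.sum(np.conj(weights) * mixture, axis=0)


def taylor_superposition(mixture, weights, residual_ops):
    """Add hypothetical high-order terms on top of the 0th-order term.

    residual_ops: list of callables (prev_term, mixture) -> next_term.
    Each one is a placeholder for a trained network (assumption).
    """
    s0 = zeroth_order_term(mixture, weights)
    estimate = s0.copy()
    term = s0
    for q, op in enumerate(residual_ops, start=1):
        term = op(term, mixture)                     # q-th order term
        estimate = estimate + term / math.factorial(q)  # assumed 1/q! weight
    return s0, estimate


# Toy usage with random data and fixed (untrained) placeholder operators.
rng = np.random.default_rng(0)
M, F, T = 4, 5, 6
mixture = rng.standard_normal((M, F, T)) + 1j * rng.standard_normal((M, F, T))
weights = np.full((M, F, T), 1.0 / M, dtype=complex)  # plain channel averaging
toy_ops = [lambda prev, mix: 0.1 * prev, lambda prev, mix: 0.1 * prev]

s0, estimate = taylor_superposition(mixture, weights, toy_ops)
```

With uniform weights the 0th-order term reduces to averaging the channels, which makes the sketch easy to check by hand. In a learned system both the weights and the operators would be network outputs rather than constants.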
Related papers
- TaylorBeamixer: Learning Taylor-inspired All-neural Multi-channel Speech Enhancement From Beam-space Dictionary Perspective (2022) · 2.26
- Embedding And Beamforming: All-neural Causal Beamformer For Multichannel Speech Enhancement (2021) · 13.05
- Locate And Beamform: Two-dimensional Locating All-neural Beamformer For Multi-channel Speech Separation (2023) · 3.58
- Dual-path Transformer Based Neural Beamformer For Target Speech Extraction (2023) · 0.00
- A Unified Multichannel Far-field Speech Recognition System: Combining Neural Beamforming With Attention Based End-to-end Model (2024) · 0.00
- Speaker Adapted Beamforming For Multi-channel Automatic Speech Recognition (2018) · 5.84
- Sequential Multi-frame Neural Beamforming For Speech Separation And Enhancement (2019) · 0.00
- Attention-based Neural Beamforming Layers For Multi-channel Speech Recognition (2021) · 0.00