Deep Neural Mel-subband Beamformer For In-car Speech Separation
2022 · Vinay Kothapally, Yong Xu, Meng Yu, et al.
Abstract
While current deep learning (DL)-based beamforming techniques have proven effective for speech separation, they are often designed to process narrow-band (NB) frequencies independently, which results in higher computational cost and inference time and makes them unsuitable for real-world use. In this paper, we propose a DL-based mel-subband spatio-temporal beamformer to perform speech separation in a car environment with reduced computational cost and inference time. As opposed to conventional subband (SB) approaches, our framework uses a mel-scale-based subband selection strategy that ensures fine-grained processing for lower frequencies, where most speech formant structure is present, and coarse-grained processing for higher frequencies. Robust frame-level beamforming weights are determined recursively for each speaker location/zone in a car from the estimated subband speech and noise covariance matrices. Furthermore, the proposed framework also estimates and suppresses any echo
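The mel-scale subband selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the number of subbands, the STFT size, or the exact mel formula, so the function names (`hz_to_mel`, `mel_to_hz`, `mel_subband_edges`), the common 2595·log10(1 + f/700) mel mapping, and the example values (257 bins, 16 kHz, 32 subbands) are all assumptions chosen for illustration.

```python
import math


def hz_to_mel(f_hz):
    # Widely used mel mapping (assumed here; the abstract does not specify one).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_subband_edges(n_bins, sample_rate, n_subbands):
    """Split STFT bins [0, n_bins) into n_subbands groups equally spaced in mel.

    Returns n_subbands + 1 strictly increasing bin indices; subband k covers
    bins edges[k] .. edges[k + 1] - 1.
    """
    if not 1 <= n_subbands <= n_bins:
        raise ValueError("need 1 <= n_subbands <= n_bins")
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    edges = []
    for i in range(n_subbands + 1):
        f_hz = mel_to_hz(max_mel * i / n_subbands)
        edges.append(round(f_hz / nyquist * (n_bins - 1)))
    edges[0] = 0
    edges[-1] = n_bins
    # Make sure every subband holds at least one bin and none overruns n_bins.
    for i in range(1, n_subbands + 1):
        lowest = edges[i - 1] + 1
        highest = n_bins - (n_subbands - i)
        edges[i] = min(max(edges[i], lowest), highest)
    return edges


# Hypothetical configuration: 512-point STFT at 16 kHz, 32 mel subbands.
edges = mel_subband_edges(n_bins=257, sample_rate=16000, n_subbands=32)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
print(edges)
print(widths)
```

In this sketch the subbands are narrow at low frequencies and grow wider toward the Nyquist frequency, which mirrors the fine-grained versus coarse-grained processing the abstract describes. A beamformer operating per subband would then estimate one set of covariance matrices per group rather than per STFT bin.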
Related papers
- Sequential Multi-frame Neural Beamforming For Speech Separation And Enhancement (2019) 0.00
- Dualsep: A Light-weight Dual-encoder Convolutional Recurrent Network For Real-time In-car Speech Separation (2024) 0.00
- ADL-MVDR: All Deep Learning MVDR Beamformer For Target Speech Separation (2020) 15.00
- Short-time Deep-learning Based Source Separation For Speech Enhancement In Reverberant Environments With Beamforming (2020) 0.00
- Deep Ad-hoc Beamforming Based On Speaker Extraction For Target-dependent Speech Separation (2020) 7.50
- Dual-path Transformer Based Neural Beamformer For Target Speech Extraction (2023) 0.00
- Locate And Beamform: Two-dimensional Locating All-neural Beamformer For Multi-channel Speech Separation (2023) 3.58
- DBNET: Doa-driven Beamforming Network For End-to-end Farfield Sound Source Separation (2020) 0.00