Real-time Speech Extraction Using Spatially Regularized Independent Low-rank Matrix Analysis And Rank-constrained Spatial Covariance Matrix Estimation
2024 · Yuto Ishikawa, Kohei Konaka, Tomohiko Nakamura, et al.
Abstract
Real-time speech extraction is an important challenge with various applications, such as speech recognition in human-like avatars and robots. In this paper, we propose a real-time extension of a speech extraction method based on independent low-rank matrix analysis (ILRMA) and rank-constrained spatial covariance matrix estimation (RCSCME). The RCSCME-based method is a multichannel blind speech extraction method that demonstrates superior speech extraction performance in diffuse noise environments. To improve its performance, we introduce spatial regularization into the ILRMA part of the RCSCME-based speech extraction and design two regularizers. Speech extraction experiments demonstrated that the proposed methods can function in real time and that the designed regularizers improve the speech extraction performance.
Related papers
- Low Rank And Sparsity Analysis Applied To Speech Enhancement Via Online Estimated Dictionary (2016)
- Multi-channel Target Speech Extraction With Channel Decorrelation And Target Speaker Adaptation (2020)
- Unsupervised Low Latency Speech Enhancement With RT-GCC-NMF (2019)
- RIR-SF: Room Impulse Response Based Spatial Feature For Target Speech Recognition In Multi-channel Multi-speaker Scenarios (2023)
- Target Speech Extraction Based On Blind Source Separation And X-vector-based Speaker Selection Trained With Data Augmentation (2020)
- Time-domain Speech Extraction With Spatial Information And Multi Speaker Conditioning Mechanism (2021)
- Robust Speaker Extraction Network Based On Iterative Refined Adaptation (2020)
- Statistical Beamformer Exploiting Non-stationarity And Sparsity With Spatially Constrained ICA For Robust Speech Recognition (2023)