An Initialization Scheme For Meeting Separation With Spatial Mixture Models
2022 · Christoph Boeddeker, Tobias Cord-Landwehr, Thilo von Neumann, et al.
Abstract
Spatial mixture model (SMM) supported acoustic beamforming has been extensively used for the separation of simultaneously active speakers. However, it has hardly been considered for the separation of meeting data, which is characterized by long recordings and only partially overlapping speech. In this contribution, we show that the fact that often only a single speaker is active can be utilized for a clever initialization of an SMM that employs time-varying class priors. In experiments on LibriCSS, we show that the proposed initialization scheme achieves a significantly lower Word Error Rate (WER) on a downstream speech recognition task than a random initialization of the class probabilities by drawing from a Dirichlet distribution. With the only requirement that the number of speakers has to be known, we obtain a WER of 5.9 %, which is comparable to the best reported WER on this data set. Furthermore, the estimated speaker activity from the mixture model serves as a diarization based o
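As a rough illustration of the random baseline mentioned in the abstract, the sketch below draws one class-probability vector per time frame from a Dirichlet distribution. It is not the authors' implementation: the function name `dirichlet_init`, the symmetric concentration value, and the frame and class counts are assumptions made only for this example.

```python
import numpy as np


def dirichlet_init(num_frames, num_classes, concentration=1.0, seed=0):
    """Draw random per-frame class probabilities from a symmetric Dirichlet.

    Hypothetical helper: the name, the default concentration, and the seed
    are illustrative assumptions and are not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    alpha = np.full(num_classes, concentration)
    # One draw per time frame; each row is non-negative and sums to 1.
    return rng.dirichlet(alpha, size=num_frames)


# Example: 1000 time frames and 3 classes (both values are assumed).
priors = dirichlet_init(num_frames=1000, num_classes=3)
```

Each row of `priors` can be read as a random starting guess for which class is active in that frame. The abstract contrasts this kind of random start with the proposed scheme, which exploits the fact that often only a single speaker is active.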
Related papers
- Simultaneous Diarization And Separation Of Meetings Through The Integration Of Statistical Mixture Models (2024)
- Handling Trade-offs In Speech Separation With Sparsely-gated Mixture Of Experts (2022)
- TS-SEP: Joint Diarization And Separation Conditioned On Estimated Speaker Embeddings (2023)
- Continuous Speech Separation Using Speaker Inventory For Long Multi-talker Recording (2020)
- Multi-talker MVDR Beamforming Based On Extended Complex Gaussian Mixture Model (2019)
- Recursive Speech Separation For Unknown Number Of Speakers (2019)
- Target Speaker Selection For Neural Network Beamforming In Multi-speaker Scenarios (2025)
- Multiple Choice Learning For Efficient Speech Separation With Many Speakers (2024)