Interrelate Training And Searching: A Unified Online Clustering Framework For Speaker Diarization
2022 · Yifan Chen, Yifan Guo, Qingxuan Li, et al.
Abstract
In online speaker diarization, samples arrive incrementally, so the overall distribution of the samples is not visible. Moreover, in most existing clustering-based methods, the training objective of the embedding extractor is not designed specifically for clustering. To improve online speaker diarization performance, we propose a unified online clustering framework in which embedding extractors and clustering algorithms interact. Specifically, the framework consists of two highly coupled parts: clustering-guided recurrent training (CGRT) and truncated beam searching clustering (TBSC). The CGRT introduces the clustering algorithm into the training process of embedding extractors, which can provide not only cluster-aware information for the embedding extractor but also crucial parameters for the subsequent clustering process. With these parameters, which contain preliminary information about the metric space, the TBSC penalizes the probability score of each cl
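To make the idea of beam searching over cluster assignments concrete, here is a minimal Python sketch of a generic online clustering loop that keeps several assignment hypotheses alive instead of committing greedily. It is not the paper's TBSC: the function names, the cosine-similarity scoring, and the fixed `new_cluster_score` are assumptions made for illustration, and the constant stands in for the parameters that the abstract says CGRT would supply.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def beam_search_cluster(embeddings, beam_width=3, new_cluster_score=0.5):
    """Assign embeddings to clusters one at a time, keeping the best hypotheses.

    Hypothetical helper: the scoring rule and both parameters are
    illustrative assumptions, not values taken from the paper.
    """
    # Each hypothesis is (total_score, labels, per-cluster embedding sums).
    beams = [(0.0, [], [])]
    for x in embeddings:
        candidates = []
        for score, labels, sums in beams:
            # Option 1: join an existing cluster, scored by similarity to its
            # running sum (same direction as the centroid).
            for k, s in enumerate(sums):
                merged = [a + b for a, b in zip(s, x)]
                new_sums = sums[:k] + [merged] + sums[k + 1:]
                candidates.append((score + cosine(x, s), labels + [k], new_sums))
            # Option 2: open a new cluster at a fixed score (assumed constant).
            candidates.append(
                (score + new_cluster_score, labels + [len(sums)], sums + [list(x)])
            )
        # Truncate to the highest-scoring hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][1]


if __name__ == "__main__":
    stream = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (1.0, 0.05)]
    print(beam_search_cluster(stream))
```

With `beam_width=1` this reduces to greedy assignment; a wider beam lets an early, uncertain decision be revised once later samples make one hypothesis clearly better.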
Related papers
- Joint Training Of Speaker Embedding Extractor, Speech And Overlap Detection For Diarization (2024)
- Geodesic Interpolation Of Frame-wise Speaker Embeddings For The Diarization Of Meeting Scenarios (2024)
- A Reinforcement Learning Framework For Online Speaker Diarization (2023)
- Enhancements For Audio-only Diarization Systems (2019)
- End-to-end Speaker Diarization As Post-processing (2020)
- Highly Efficient Real-time Streaming And Fully On-device Speaker Diarization With Multi-stage Clustering (2022)
- Deep Self-supervised Hierarchical Clustering For Speaker Diarization (2020)
- Reformulating Speaker Diarization As Community Detection With Emphasis On Topological Structure (2022)