Absolute Decision Corrupts Absolutely: Conservative Online Speaker Diarisation
2022 · Youngki Kwon, Hee-Soo Heo, Bong-Jin Lee, et al.
Abstract
Our focus lies in developing an online speaker diarisation framework which demonstrates robust performance across diverse domains. In online speaker diarisation, outputs generated in real-time are irreversible, and a few misjudgements in the early phase of an input session can lead to catastrophic results. We hypothesise that cautiously increasing the number of estimated speakers is of paramount importance among many other factors. Thus, our proposed framework includes decreasing the number of speakers by one when the system judges that an increase in the past was faulty. We also adopt dual buffers, checkpoints and centroids, where checkpoints are combined with silhouette coefficients to estimate the number of speakers and centroids represent speakers. Again, we believe that more than one centroid can be generated from one speaker. Thus we design a clustering-based label matching technique to assign labels in real-time. The resulting system is lightweight yet surprisingly effective. Th
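The abstract mentions estimating the number of speakers with silhouette coefficients and raising that number only cautiously. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the plain k-means clustering, the `max_speakers` cap and the `threshold` value are all assumptions made for illustration. It keeps the estimate at one speaker unless some larger cluster count produces a silhouette score above the threshold.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means with farthest-point initialisation (illustrative only)."""
    rng = np.random.default_rng(seed)
    centroids = [x[rng.integers(len(x))]]
    for _ in range(1, k):
        # Pick the point farthest from all centroids chosen so far.
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centroids], axis=0)
        centroids.append(x[np.argmax(d)])
    centroids = np.stack(centroids).astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None] - centroids[None], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    dists = np.linalg.norm(x[:, None] - centroids[None], axis=2)
    return np.argmin(dists, axis=1)


def silhouette(x, labels):
    """Mean silhouette coefficient; returns -1.0 if there is only one cluster."""
    ids = np.unique(labels)
    if len(ids) < 2:
        return -1.0
    dist = np.linalg.norm(x[:, None] - x[None], axis=2)
    scores = np.zeros(len(x))
    for i in range(len(x)):
        same = labels == labels[i]
        n_same = int(same.sum())
        if n_same == 1:
            continue  # singleton clusters score 0 by convention
        a = dist[i, same].sum() / (n_same - 1)  # mean intra-cluster distance
        b = min(dist[i, labels == c].mean() for c in ids if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return float(scores.mean())


def estimate_num_speakers(embeddings, max_speakers=4, threshold=0.5):
    """Conservative estimate: stay at 1 unless a split scores above threshold.

    `threshold` and `max_speakers` are hypothetical values for this sketch.
    """
    best_k, best_score = 1, threshold
    upper = min(max_speakers, len(embeddings) - 1)
    for k in range(2, upper + 1):
        score = silhouette(embeddings, kmeans(embeddings, k))
        if score > best_score:
            best_k, best_score = k, score
    return best_k
```

Here `embeddings` stands in for a buffer of speaker embeddings, one row per segment. A higher `threshold` makes the estimator more reluctant to add a speaker, which is the conservative behaviour the abstract argues for.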