Low-latency Online Speaker Diarization With Graph-based Label Generation
2021 · Yucong Zhang, Qinjian Lin, Weiqing Wang, et al.
Abstract
This paper introduces an online speaker diarization system that can handle long audio recordings with low latency. We enable Agglomerative Hierarchical Clustering (AHC) to work in an online fashion by introducing a label matching algorithm, which resolves the inconsistency between the output labels and the hidden labels that are generated each turn. To ensure low latency in the online setting, we introduce a variant of AHC, namely chkpt-AHC, to cluster the speakers. In addition, we propose a speaker embedding graph to exploit a graph-based re-clustering method, which further improves performance. We evaluate our systems on both the DIHARD3 and VoxConverse datasets. The experimental results show that our proposed online systems outperform our baseline online system and perform comparably to our offline systems. We find that the framework combining the chkpt-AHC method and the label matching algorithm works well in the online setting. Moreover, the
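The abstract does not spell out how the label matching works, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each turn re-clusters all segments seen so far, that the resulting hidden labels are arbitrary, and that they can be aligned with the labels already emitted by greedy maximum-overlap matching. The function names `match_labels` and `relabel_turn` are hypothetical.

```python
from collections import Counter


def match_labels(prev_output, new_hidden):
    """Map this turn's hidden cluster labels onto previously emitted output labels.

    Assumption: new_hidden covers the same segments as prev_output, in the same
    order, followed by the segments that arrived during this turn.
    """
    # Count how often each (hidden, output) pair co-occurs on already-labelled segments.
    overlap = Counter(zip(new_hidden[:len(prev_output)], prev_output))

    mapping, used = {}, set()
    # Greedy one-to-one assignment: consider the pairs with the largest overlap first.
    for (hid, out), _ in sorted(overlap.items(), key=lambda kv: (-kv[1], kv[0])):
        if hid not in mapping and out not in used:
            mapping[hid] = out
            used.add(out)

    # Hidden labels left unmatched are treated as new speakers and get fresh ids.
    next_id = max(prev_output, default=-1) + 1
    for hid in new_hidden:
        if hid not in mapping:
            mapping[hid] = next_id
            next_id += 1
    return mapping


def relabel_turn(prev_output, new_hidden):
    """Keep past outputs as emitted and label only the newly arrived segments."""
    mapping = match_labels(prev_output, new_hidden)
    return prev_output + [mapping[h] for h in new_hidden[len(prev_output):]]


# Four segments were already emitted as speakers 0, 0, 1, 1. This turn's
# clustering uses swapped ids for them and adds two new segments.
print(relabel_turn([0, 0, 1, 1], [1, 1, 0, 0, 0, 2]))
```

In this sketch, labels that have already been output are never revised, which is what keeps the stream consistent for a listener. A greedy pass is used only for brevity; an optimal one-to-one assignment (for example, the Hungarian method) could be substituted for it.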
Related papers
- Overlap-aware Low-latency Online Speaker Diarization Based On End-to-end Local Segmentation (2021) [10.35]
- A Reinforcement Learning Framework For Online Speaker Diarization (2023) [0.00]
- Systematic Evaluation Of Online Speaker Diarization Systems Regarding Their Latency (2024) [0.00]
- Deep Self-supervised Hierarchical Clustering For Speaker Diarization (2020) [5.24]
- Highly Efficient Real-time Streaming And Fully On-device Speaker Diarization With Multi-stage Clustering (2022) [0.00]
- Speaker Diarization Using Two-pass Leave-one-out Gaussian PLDA Clustering Of DNN Embeddings (2021) [2.26]
- End-to-end Speaker Diarization As Post-processing (2020) [11.08]
- Absolute Decision Corrupts Absolutely: Conservative Online Speaker Diarisation (2022) [0.00]