Deep Self-supervised Hierarchical Clustering For Speaker Diarization
2020 · Prachi Singh, Sriram Ganapathy
Abstract
State-of-the-art speaker diarization systems use agglomerative hierarchical clustering (AHC), which clusters previously learned neural embeddings. While the clustering approach attempts to identify speaker clusters, the AHC algorithm does not involve any further learning. In this paper, we propose a novel algorithm for hierarchical clustering that combines speaker clustering with a representation learning framework. The proposed approach is based on the principles of self-supervised learning, where the self-supervision is derived from the clustering algorithm. The representation learning network is trained with a regularized triplet loss using the clustering solution at the current step, while the clustering algorithm uses the deep embeddings from the representation learning step. By combining the self-supervision based representation learning along with the clustering algorithm, we show that the proposed algorithm improves significantly 29% relative improveme
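The alternation the abstract describes (cluster, derive pseudo-labels, compute a triplet loss on them, repeat) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the average-linkage cosine merge rule, the margin value, and the synthetic data are all assumptions made for illustration. The regularizer is not specified in the abstract and is omitted, and no network is trained here, so the embeddings stay fixed where a real system would update them.

```python
import numpy as np


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T


def ahc_merge_step(embeddings, labels):
    """Merge the two clusters with the highest average-linkage cosine similarity."""
    sim = cosine_similarity_matrix(embeddings)
    ids = np.unique(labels)
    best_score, best_pair = -np.inf, None
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            score = sim[np.ix_(labels == a, labels == b)].mean()
            if score > best_score:
                best_score, best_pair = score, (a, b)
    merged = labels.copy()
    merged[merged == best_pair[1]] = best_pair[0]
    return merged


def sample_triplets(labels, rng, n_triplets=32):
    """Draw (anchor, positive, negative) indices, using cluster labels as pseudo-labels."""
    idx = np.arange(len(labels))
    # An anchor needs at least one same-cluster and one other-cluster segment.
    anchors = [i for i in idx
               if np.sum(labels == labels[i]) > 1 and np.any(labels != labels[i])]
    if not anchors:
        return np.empty((0, 3), dtype=int)
    triplets = []
    for _ in range(n_triplets):
        a = rng.choice(anchors)
        p = rng.choice(idx[(labels == labels[a]) & (idx != a)])
        n = rng.choice(idx[labels != labels[a]])
        triplets.append((a, p, n))
    return np.array(triplets)


def triplet_loss(embeddings, triplets, margin=0.2):
    """Mean hinge loss: the positive should be more similar than the negative by `margin`."""
    if len(triplets) == 0:
        return 0.0
    sim = cosine_similarity_matrix(embeddings)
    a, p, n = triplets.T
    return float(np.maximum(0.0, sim[a, n] - sim[a, p] + margin).mean())


# Synthetic stand-in for segment embeddings: two well-separated "speakers".
rng = np.random.default_rng(0)
centers = np.array([[5.0, 0.0, 0.0, 0.0], [0.0, 5.0, 0.0, 0.0]])
embeddings = np.vstack([c + 0.1 * rng.standard_normal((5, 4)) for c in centers])

labels = np.arange(len(embeddings))  # every segment starts as its own cluster
while len(np.unique(labels)) > 2:    # assumed stopping rule: known speaker count
    labels = ahc_merge_step(embeddings, labels)
    triplets = sample_triplets(labels, rng)
    loss = triplet_loss(embeddings, triplets)
    # A real system would update the embedding network with this loss here.
```

The point of the sketch is the data flow: each merge changes the pseudo-labels, which in turn changes which triplets are sampled for the next representation update.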
Related papers
- Self-supervised Representation Learning With Path Integral Clustering For Speaker Diarization (2021) 8.35
- End-to-end Supervised Hierarchical Graph Clustering For Speaker Diarization (2024) 5.24
- Supervised Hierarchical Clustering Using Graph Neural Networks For Speaker Diarization (2023) 0.00
- Enhancements For Audio-only Diarization Systems (2019) 0.00
- Learning Deep Representations By Multilayer Bootstrap Networks For Speaker Diarization (2019) 0.00
- Discriminative Neural Clustering For Speaker Diarisation (2019) 10.07
- Assessing The Robustness Of Spectral Clustering For Deep Speaker Diarization (2024) 3.58
- Speaker Diarization Using Deep Recurrent Convolutional Neural Networks For Speaker Embeddings (2017) 9.41