Self-tuning Spectral Clustering For Speaker Diarization
2024 · Nikhil Raghav, Avisek Gupta, Md Sahidullah, et al.
Abstract
Spectral clustering has proven effective at grouping speech representations for speaker diarization, but post-processing the affinity matrix remains difficult because it requires careful tuning before the Laplacian is constructed. In this study, we present a novel pruning algorithm that creates a sparse affinity matrix; we call the resulting method spectral clustering on p-neighborhood retained affinity matrix (SC-pNA). Our method improves on node-specific fixed neighbor selection by allowing a variable number of neighbors, and it eliminates the need for external tuning data because the pruning parameters are derived directly from the affinity matrix. SC-pNA does so by identifying two clusters in every row of the initial affinity matrix and retaining only the top p% of similarity scores from the cluster containing the larger similarities. Spectral clustering is performed subsequently, with the number of clusters determined by the maximum eigengap. Experimental results on the challenging DIHARD-III dataset highlight the
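The row-wise pruning and eigengap steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several details are assumptions made only for illustration: the function names, the use of a simple 1-D 2-means to split each row, the default value of `p`, the symmetrization by averaging, and the use of the symmetric normalized Laplacian. The final grouping of eigenvectors (for example with k-means) is omitted.

```python
import numpy as np


def two_means_1d(values, n_iter=50):
    """Split 1-D values into a low and a high group (assumed: plain 2-means).

    Returns a boolean mask that is True for the high-similarity group.
    """
    lo, hi = values.min(), values.max()
    if lo == hi:  # all scores identical: treat everything as one group
        return np.ones_like(values, dtype=bool)
    for _ in range(n_iter):
        high = np.abs(values - hi) < np.abs(values - lo)
        new_lo, new_hi = values[~high].mean(), values[high].mean()
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    return high


def prune_affinity(A, p=20.0):
    """Keep, per row, the top p% of scores from the high-similarity group.

    `p` and its default are hypothetical; the number of retained
    neighbors varies per row because the group sizes vary.
    """
    pruned = np.zeros_like(A, dtype=float)
    for i in range(A.shape[0]):
        row = A[i]
        idx = np.flatnonzero(two_means_1d(row))
        k = max(1, int(np.ceil(p / 100.0 * idx.size)))
        keep = idx[np.argsort(row[idx])[::-1][:k]]
        pruned[i, keep] = row[keep]
    # Assumption: symmetrize by averaging so the Laplacian is symmetric.
    return 0.5 * (pruned + pruned.T)


def estimate_num_clusters(A, max_k=10):
    """Pick the number of clusters at the largest eigengap of the
    symmetric normalized Laplacian (assumed Laplacian variant)."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(L)  # ascending order
    gaps = np.diff(eigvals[: max_k + 1])
    return int(np.argmax(gaps)) + 1
```

Given a square matrix `A` of pairwise similarities between speech segments, `estimate_num_clusters(prune_affinity(A))` returns an estimated speaker count under these assumptions.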
Related papers
- Auto-tuning Spectral Clustering For Speaker Diarization Using Normalized Maximum Eigengap (2020) 14.58
- Assessing The Robustness Of Spectral Clustering For Deep Speaker Diarization (2024) 3.58
- Spectral Clustering-aware Learning Of Embeddings For Speaker Diarisation (2022) 2.26
- Multi-class Spectral Clustering With Overlaps For Speaker Diarization (2020) 10.35
- LSTM Based Similarity Measurement With Spectral Clustering For Speaker Diarization (2019) 13.79
- Enhancements For Audio-only Diarization Systems (2019) 0.00
- Deep Self-supervised Hierarchical Clustering For Speaker Diarization (2020) 5.24
- Self-supervised Representation Learning With Path Integral Clustering For Speaker Diarization (2021) 8.35