Multi-class Spectral Clustering With Overlaps For Speaker Diarization
2020 · Desh Raj, Zili Huang, Sanjeev Khudanpur
Abstract
This paper describes a method for overlap-aware speaker diarization. Given an overlap detector and a speaker embedding extractor, our method performs spectral clustering of segments informed by the output of the overlap detector. This is achieved by transforming the discrete clustering problem into a convex optimization problem which is solved by eigen-decomposition. Thereafter, we discretize the solution by alternating between singular value decomposition and a modified version of non-maximal suppression which is constrained by the output of the overlap detector. Furthermore, we detail an HMM-DNN based overlap detector which performs frame-level classification and enforces duration constraints through HMM state transitions. Our method achieves a test diarization error rate (DER) of 24.0% on the mixed-headset setting of the AMI meeting corpus, which is a relative improvement of 15.2% over a strong agglomerative hierarchical clustering baseline, and compares favorably with other overlap
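The clustering procedure summarized in the abstract can be illustrated with a minimal sketch. It assumes a symmetric affinity matrix with strictly positive row sums, a known number of speakers (at least two), and a boolean per-segment overlap mask. The function names, the first-row initialization of the rotation, and the rule that an overlap segment receives its two highest-scoring speakers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def constrained_nms(scores, overlap_mask):
    """Row-wise suppression: keep the top speaker for every segment and,
    as an assumption of this sketch, the top two for overlap segments."""
    n, k = scores.shape
    labels = np.zeros((n, k), dtype=int)
    order = np.argsort(scores, axis=1)  # ascending per row
    labels[np.arange(n), order[:, -1]] = 1
    idx = np.flatnonzero(overlap_mask)
    labels[idx, order[idx, -2]] = 1
    return labels


def overlap_aware_spectral_clustering(affinity, num_speakers, overlap_mask,
                                      max_iter=50, tol=1e-8):
    """Hypothetical helper: relaxed solution by eigen-decomposition, then
    discretization by alternating constrained NMS and SVD."""
    W = np.asarray(affinity, dtype=float)
    overlap_mask = np.asarray(overlap_mask, dtype=bool)

    # Relaxed problem: leading eigenvectors of the normalized affinity.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    Z = d_inv_sqrt[:, None] * eigvecs[:, -num_speakers:]
    X_tilde = Z / np.linalg.norm(Z, axis=1, keepdims=True)

    # Initialize the rotation from rows that are as orthogonal as possible
    # (deterministic start at row 0; an assumption of this sketch).
    R = np.zeros((num_speakers, num_speakers))
    R[:, 0] = X_tilde[0]
    c = np.zeros(X_tilde.shape[0])
    for k in range(1, num_speakers):
        c += np.abs(X_tilde @ R[:, k - 1])
        R[:, k] = X_tilde[np.argmin(c)]

    prev_obj = 0.0
    for _ in range(max_iter):
        # Discretize the rotated continuous solution.
        X = constrained_nms(X_tilde @ R, overlap_mask)
        # Best rotation for the current discrete labels, via SVD.
        U, omega, Vt = np.linalg.svd(X.T @ X_tilde)
        obj = omega.sum()
        if abs(obj - prev_obj) < tol:
            break
        prev_obj = obj
        R = Vt.T @ U.T
    return X
```

The returned matrix has one row per segment and one column per speaker; a row flagged in `overlap_mask` carries two ones, every other row exactly one. In practice the affinity would come from the speaker embeddings (for example, cosine similarity) and the mask from the overlap detector.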
Related papers
- Overlap-aware Diarization: Resegmentation Using Neural End-to-end Overlapped Speech Detection (2019) 13.17
- End-to-end Speaker Diarization As Post-processing (2020) 11.08
- Geodesic Interpolation Of Frame-wise Speaker Embeddings For The Diarization Of Meeting Scenarios (2024) 5.24
- Assessing The Robustness Of Spectral Clustering For Deep Speaker Diarization (2024) 3.58
- Overlap-aware Low-latency Online Speaker Diarization Based On End-to-end Local Segmentation (2021) 10.35
- Speaker Embedding-aware Neural Diarization: An Efficient Framework For Overlapping Speech Diarization In Meeting Scenarios (2022) 0.00
- Enhancements For Audio-only Diarization Systems (2019) 0.00
- Compositional Embedding Models For Speaker Identification And Diarization With Simultaneous Speech From 2+ Speakers (2020) 3.58