Discriminative Neural Clustering For Speaker Diarisation
2019 · Qiujia Li, Florian L. Kreyssig, Chao Zhang, et al.
Abstract
In this paper, we propose Discriminative Neural Clustering (DNC) that formulates data clustering with a maximum number of clusters as a supervised sequence-to-sequence learning problem. Compared to traditional unsupervised clustering algorithms, DNC learns clustering patterns from training data without requiring an explicit definition of a similarity measure. An implementation of DNC based on the Transformer architecture is shown to be effective on a speaker diarisation task using the challenging AMI dataset. Since AMI contains only 147 complete meetings as individual input sequences, data scarcity is a significant issue for training a Transformer model for DNC. Accordingly, this paper proposes three data augmentation schemes: sub-sequence randomisation, input vector randomisation, and Diaconis augmentation, which generates new data samples by rotating the entire input sequence of L2-normalised speaker embeddings. Experimental results on AMI show that DNC achieves a reduction in speake
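The rotation idea behind Diaconis augmentation can be illustrated with a minimal sketch. The abstract says only that the whole sequence of L2-normalised speaker embeddings is rotated; how the rotation is sampled is not stated here, so the QR-based sampler below is an assumption, and the names `random_rotation` and `rotate_sequence` are hypothetical.

```python
import numpy as np


def random_rotation(dim, rng):
    """Sample a random rotation matrix (orthogonal, determinant +1).

    Assumption: the rotation comes from the QR decomposition of a Gaussian
    matrix. The abstract does not specify the sampling procedure.
    """
    gaussian = rng.standard_normal((dim, dim))
    q, r = np.linalg.qr(gaussian)
    # Fix the sign ambiguity of the QR decomposition, column by column.
    q = q * np.sign(np.diag(r))
    # Flip one column if needed so the result is a rotation, not a reflection.
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def rotate_sequence(embeddings, rng):
    """L2-normalise every embedding, then apply one shared random rotation.

    embeddings: array of shape (sequence_length, dim), one row per segment.
    """
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    rotation = random_rotation(unit.shape[1], rng)
    # The same rotation is applied to every vector in the sequence.
    return unit @ rotation.T


rng = np.random.default_rng(0)
sequence = rng.standard_normal((6, 32))  # toy stand-in for speaker embeddings
augmented = rotate_sequence(sequence, rng)
```

Because one rotation is shared across the sequence, the rotated vectors stay on the unit hypersphere and every pairwise cosine similarity is unchanged, so the original cluster labels remain valid for the augmented sequence.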
Related papers
- Deep Self-supervised Hierarchical Clustering For Speaker Diarization (2020)
- Meta-learning With Latent Space Clustering In Generative Adversarial Network For Speaker Diarization (2020)
- Enhancements For Audio-only Diarization Systems (2019)
- Advances In Integration Of End-to-end Neural And Clustering-based Diarization For Real Conversational Speech (2021)
- Spectral Clustering-aware Learning Of Embeddings For Speaker Diarisation (2022)
- Turn-to-diarize: Online Speaker Diarization Constrained By Transformer Transducer Speaker Turn Detection (2021)
- Learning Deep Representations By Multilayer Bootstrap Networks For Speaker Diarization (2019)
- Interrelate Training And Searching: A Unified Online Clustering Framework For Speaker Diarization (2022)