Online Neural Diarization Of Unlimited Numbers Of Speakers Using Global And Local Attractors
2022 · Shota Horiguchi, Shinji Watanabe, Paola Garcia, et al.
Abstract
A method to perform offline and online speaker diarization for an unlimited number of speakers is described in this paper. End-to-end neural diarization (EEND) has achieved overlap-aware speaker diarization by formulating it as a multi-label classification problem. It has also been extended for a flexible number of speakers by introducing speaker-wise attractors. However, the output number of speakers of attractor-based EEND is empirically capped; it cannot deal with cases where the number of speakers appearing during inference is higher than that during training because its speaker counting is trained in a fully supervised manner. Our method, EEND-GLA, solves this problem by introducing unsupervised clustering into attractor-based EEND. In the method, the input audio is first divided into short blocks, then attractor-based diarization is performed for each block, and finally, the results of each block are clustered on the basis of the similarity between locally-calculated attractors.
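The block-then-cluster pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The local attractors are assumed to be given as plain vectors (in EEND-GLA they come from the neural model), and the greedy cosine-similarity assignment is a simplified stand-in for the paper's clustering step. The function names, the `threshold` parameter, and the rule that two attractors from the same block never share a global speaker are all assumptions made for illustration.

```python
import math


def split_into_blocks(num_frames, block_len):
    """Return (start, end) frame-index pairs covering [0, num_frames)."""
    return [(s, min(s + block_len, num_frames))
            for s in range(0, num_frames, block_len)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_local_attractors(blocks, threshold=0.8):
    """Map each block's local attractors to global speaker indices.

    blocks: list of blocks, each a list of attractor vectors.
    Hypothetical greedy scheme: an attractor joins the most similar
    existing global speaker if the similarity reaches `threshold`,
    otherwise it opens a new one. Attractors within one block are
    assumed to belong to different speakers.
    """
    centroids = []  # running sum of member attractors per global speaker
    labels = []
    for attractors in blocks:
        used = set()  # global speakers already taken within this block
        block_labels = []
        for att in attractors:
            best, best_sim = None, threshold
            for idx, centroid in enumerate(centroids):
                if idx in used:
                    continue
                sim = cosine(att, centroid)
                if sim >= best_sim:
                    best, best_sim = idx, sim
            if best is None:
                centroids.append(list(att))
                best = len(centroids) - 1
            else:
                centroids[best] = [c + x for c, x in zip(centroids[best], att)]
            used.add(best)
            block_labels.append(best)
        labels.append(block_labels)
    return labels


# Toy input: three blocks with two local attractors each.
toy_blocks = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.9, 0.1], [0.0, 0.0, 1.0]],
    [[0.95, 0.05, 0.0], [0.0, 0.1, 1.0]],
]
global_labels = cluster_local_attractors(toy_blocks)
num_global_speakers = len({l for block in global_labels for l in block})
```

In this toy input every block contains only two speakers, yet the clustering recovers three global speakers across the recording. This is the property the abstract relies on: the per-block speaker count can stay within what the model saw in training while the total is not capped.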
Related papers
- Towards Neural Diarization For Unlimited Numbers Of Speakers Using Global And Local Attractors (2021)
- Encoder-decoder Based Attractors For End-to-end Neural Diarization (2021)
- LS-EEND: Long-form Streaming End-to-end Neural Diarization With Online Attractor Extraction (2024)
- BW-EDA-EEND: Streaming End-to-end Neural Speaker Diarization For A Variable Number Of Speakers (2020)
- Online End-to-end Neural Diarization With Speaker-tracing Buffer (2020)
- Speech-aware Neural Diarization With Encoder-decoder Attractor Guided By Attention Constraints (2024)
- Frame-wise Streaming End-to-end Speaker Diarization With Non-autoregressive Self-attention-based Attractors (2023)
- Speakers Unembedded: Embedding-free Approach To Long-form Neural Diarization (2024)