Probabilistic Fusion And Calibration Of Neural Speaker Diarization Models
2025 · Juan Ignacio Alvarez-Trejos, Sergio A. Balanya, Daniel Ramos, et al.
Abstract
End-to-End Neural Diarization (EEND) systems produce frame-level probabilistic speaker activity estimates, yet since evaluation focuses primarily on Diarization Error Rate (DER), the reliability and calibration of these confidence scores have been largely neglected. When fusing multiple diarization systems, DOVER-Lap remains the only established approach, operating at the segment level with hard decisions. We propose working with continuous probability outputs, which enables more sophisticated fusion and calibration techniques that can leverage model uncertainty and complementary strengths across different architectures. This paper presents the first comprehensive framework for calibrating and fusing EEND models at the probability level. We investigate two output formulations (multilabel and powerset representations) and their impact on calibration and fusion effectiveness. Through extensive experiments on the CallHome two-speaker benchmark, we demonstrate that proper calibration provi
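To make the idea of probability-level calibration and fusion concrete, the following is a minimal sketch for the two-speaker case. It is not the authors' implementation. The function names, the affine (Platt-style) calibration in the logit domain, the weighted logit-averaging fusion rule, and the independence assumption used to map multilabel outputs to powerset classes are all illustrative assumptions. The sketch also assumes that speaker order has already been aligned across systems.

```python
import math

def logit(p, eps=1e-6):
    """Log-odds of a probability, clipped away from 0 and 1 for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def calibrate(p, scale=1.0, bias=0.0):
    """Affine calibration in the logit domain (hypothetical parameters;
    in practice they would be fitted on held-out data)."""
    return sigmoid(scale * logit(p) + bias)

def fuse_multilabel(frame_probs, weights=None):
    """Fuse one frame of per-system speaker-activity probabilities.

    frame_probs: one list per system, each holding one probability per speaker.
    Assumption: fusion is a weighted mean of logits, one of several options.
    """
    n_sys = len(frame_probs)
    if weights is None:
        weights = [1.0 / n_sys] * n_sys
    n_spk = len(frame_probs[0])
    return [
        sigmoid(sum(w * logit(p[s]) for w, p in zip(weights, frame_probs)))
        for s in range(n_spk)
    ]

def multilabel_to_powerset(p1, p2):
    """Two-speaker powerset class probabilities, assuming the two
    speaker activities are independent (an illustrative simplification)."""
    return {
        "silence": (1 - p1) * (1 - p2),
        "spk1": p1 * (1 - p2),
        "spk2": (1 - p1) * p2,
        "overlap": p1 * p2,
    }

# Example: two systems, one frame, two speakers.
system_a = [calibrate(0.95, scale=0.8), calibrate(0.20, scale=0.8)]
system_b = [0.70, 0.40]
fused = fuse_multilabel([system_a, system_b])
powerset = multilabel_to_powerset(*fused)
```

Unlike segment-level voting on hard decisions, a scheme of this kind keeps each system's confidence until the final thresholding step, so an uncertain system contributes less to the fused decision than a confident one.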
Related papers
- EEND-SS: Joint End-to-end Neural Speaker Diarization And Speech Separation For Flexible Number Of Speakers (2022)
- End-to-end Neural Diarization: Reformulating Speaker Diarization As Simple Multi-label Classification (2020)
- Multi-channel End-to-end Neural Diarization With Distributed Microphones (2021)
- EEND-DEMUX: End-to-end Neural Speaker Diarization Via Demultiplexed Speaker Embeddings (2023)
- Advances In Integration Of End-to-end Neural And Clustering-based Diarization For Real Conversational Speech (2021)
- Towards Word-level End-to-end Neural Speaker Diarization With Auxiliary Network (2023)
- Multi-scale Speaker Diarization With Neural Affinity Score Fusion (2020)
- End-to-end Speaker Diarization Conditioned On Speech Activity And Overlap Detection (2021)