The Hitachi-JHU DIHARD III System: Competitive End-to-end Neural Diarization And X-vector Clustering Systems Combined By DOVER-Lap
2021 · Shota Horiguchi, Nelson Yalta, Paola Garcia, et al.
Abstract
This paper provides a detailed description of the Hitachi-JHU system submitted to the Third DIHARD Speech Diarization Challenge. The system outputs an ensemble of five subsystems: two x-vector-based subsystems, two end-to-end neural diarization-based subsystems, and one hybrid subsystem. We refined each subsystem so that all five became competitive and complementary. After DOVER-Lap-based system combination, the system achieved diarization error rates of 11.58% and 14.09% on the full and core sets of Track 1, and 16.94% and 20.01% on the full and core sets of Track 2, respectively. With these results, we won second place in all tasks of the challenge.
Related papers
- The DKU-Duke-Lenovo System Description For The Third DIHARD Speech Diarization Challenge (2021) · 0.00
- DIHARD II Is Still Hard: Experimental Results And Discussions From The DKU-LENOVO Team (2020) · 6.34
- UWB-NTIS Speaker Diarization System For The DIHARD II 2019 Challenge (2019) · 4.52
- BUT System Description For DIHARD Speech Diarization Challenge 2019 (2019) · 0.00
- NTT Speaker Diarization System For CHiME-7: Multi-domain, Multi-microphone End-to-end And Vector Clustering Diarization (2023) · 7.16
- The Speed Submission To DIHARD II: Contributions & Lessons Learned (2019) · 0.00
- The HUAWEI Speaker Diarisation System For The VoxCeleb Speaker Diarisation Challenge (2020) · 0.00
- The DKU-MSXF Diarization System For The VoxCeleb Speaker Recognition Challenge 2023 (2023) · 5.24