Exploring Detection-based Method For Speaker Diarization @ Ego4D Audio-only Diarization Challenge 2022
2022 · Jiahao Wang, Guo Chen, Yin-Dong Zheng, et al.
Abstract
We provide the technical report for the Ego4D audio-only diarization challenge at ECCV 2022. Speaker diarization takes an audio stream as input and outputs homogeneous segments labeled by speaker identity. It aims to answer the question "Who spoke when?" In this paper, we explore a detection-based method for the audio-only speaker diarization task. Our method first extracts audio features with an audio backbone and then feeds these features to a detection-generate network to obtain speaker proposals. After postprocessing the proposals, we obtain the final diarization results. We validate the method on the validation set, and it achieves a diarization error rate (DER) of 53.85 on the test set. This result ranks 3rd on the leaderboard of the Ego4D audio-only diarization challenge 2022.
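For readers unfamiliar with the DER metric quoted in the abstract, the following is a minimal sketch of a frame-based DER computation. It is not the challenge's official scorer: the function names `to_frames` and `frame_der`, the frame step, the absence of a forgiveness collar, and the brute-force speaker mapping are all assumptions made for illustration. DER is taken here in its usual sense, as missed speech, false-alarm speech, and speaker confusion summed and divided by the total reference speech time.

```python
from itertools import permutations


def to_frames(segments, n_frames, step):
    """Rasterize (start, end, speaker) segments into per-frame speaker sets."""
    frames = [set() for _ in range(n_frames)]
    for start, end, speaker in segments:
        first = int(round(start / step))
        last = int(round(end / step))
        for i in range(max(first, 0), min(last, n_frames)):
            frames[i].add(speaker)
    return frames


def frame_der(reference, hypothesis, duration, step=0.01):
    """Illustrative frame-based DER in percent (no collar; assumption).

    reference / hypothesis: lists of (start_sec, end_sec, speaker_label).
    The hypothesis-to-reference speaker mapping is found by brute force,
    which is only practical for a handful of speakers.
    """
    n_frames = int(round(duration / step))
    ref = to_frames(reference, n_frames, step)
    hyp = to_frames(hypothesis, n_frames, step)
    ref_spk = sorted({s for _, _, s in reference})
    hyp_spk = sorted({s for _, _, s in hypothesis})

    total = sum(len(r) for r in ref)  # total reference speech, in frames
    if total == 0:
        raise ValueError("reference contains no speech")

    # Pad with None so every reference speaker can be left unassigned.
    padded = hyp_spk + [None] * max(0, len(ref_spk) - len(hyp_spk))

    best = None
    for perm in permutations(padded, len(ref_spk)):
        mapping = {h: r for r, h in zip(ref_spk, perm) if h is not None}
        errors = 0
        for r, h in zip(ref, hyp):
            mapped = {mapping.get(s) for s in h}  # unmapped labels -> None
            correct = len(r & mapped)
            # miss + false alarm + confusion == max(|ref|, |hyp|) - correct
            errors += max(len(r), len(h)) - correct
        best = errors if best is None else min(best, errors)

    return 100.0 * best / total


if __name__ == "__main__":
    reference = [(0.0, 5.0, "A"), (5.0, 10.0, "B")]
    hypothesis = [(0.0, 4.0, "s1"), (4.0, 10.0, "s2")]
    print(frame_der(reference, hypothesis, duration=10.0, step=0.1))
```

In the usage example, one second of speaker A is attributed to the wrong hypothesis speaker out of ten seconds of reference speech. Official evaluations typically rely on established scoring tools, whose collar and overlap conventions may differ from this sketch.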
Related papers
- The HUAWEI Speaker Diarisation System For The Voxceleb Speaker Diarisation Challenge (2020)
- STHG: Spatial-temporal Heterogeneous Graph Learning For Advanced Audio-visual Diarization (2023)
- Microsoft Speaker Diarization System For The Voxceleb Speaker Recognition Challenge 2020 (2020)
- Spot The Conversation: Speaker Diarisation In The Wild (2020)
- The DKU-MSXF Diarization System For The Voxceleb Speaker Recognition Challenge 2023 (2023)
- Robust Acoustic Domain Identification With Its Application To Speaker Diarization (2022)
- The BUCEA Speaker Diarization System For The Voxceleb Speaker Recognition Challenge 2022 (2022)
- Exploring Speaker-related Information In Spoken Language Understanding For Better Speaker Diarization (2023)