Exploring Speaker-related Information In Spoken Language Understanding For Better Speaker Diarization
2023 · Luyao Cheng, Siqi Zheng, Zhang Qinglin, et al.
Abstract
Speaker diarization (SD) is a classic task in speech processing and is crucial in multi-party scenarios such as meetings and conversations. Current mainstream speaker diarization approaches consider acoustic information only, which results in performance degradation under adverse acoustic conditions. In this paper, we propose methods to extract speaker-related information from semantic content in multi-party meetings, which, as we will show, can further benefit speaker diarization. We introduce two sub-tasks, Dialogue Detection and Speaker-Turn Detection, in which we effectively extract speaker information from conversational semantics. We also propose a simple yet effective algorithm to jointly model acoustic and semantic information and obtain speaker-identified texts. Experiments on both the AISHELL-4 and AliMeeting datasets show that our method achieves consistent improvements over acoustic-only speaker diarization systems.
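The abstract does not spell out the joint algorithm, so the following is only an illustrative sketch of how semantic speaker-turn cues could in principle correct acoustic labels. It is not the authors' method. The function name `smooth_speakers_by_turn`, the per-word label format, and the majority-vote rule are all assumptions made for this example: each stretch of words between two predicted speaker turns is assigned the acoustic label that occurs most often within it.

```python
from collections import Counter


def smooth_speakers_by_turn(acoustic_labels, turn_boundaries):
    """Hypothetical helper, not taken from the paper.

    acoustic_labels: per-word speaker IDs from an acoustic-only diarizer.
    turn_boundaries: word indices at which a text-based model predicts
        that a new speaker turn starts.
    Returns per-word labels in which every turn segment carries the
    majority acoustic label of that segment.
    """
    n = len(acoustic_labels)
    # Keep only boundaries that fall strictly inside the word sequence.
    inner = [b for b in sorted(set(turn_boundaries)) if 0 < b < n]
    cuts = [0] + inner + [n]

    smoothed = []
    for start, end in zip(cuts, cuts[1:]):
        segment = acoustic_labels[start:end]
        if not segment:
            continue
        # Majority vote over the acoustic labels inside this turn.
        majority = Counter(segment).most_common(1)[0][0]
        smoothed.extend([majority] * len(segment))
    return smoothed


if __name__ == "__main__":
    labels = ["A", "A", "B", "A", "B", "B", "A", "B"]
    print(smooth_speakers_by_turn(labels, [4]))
```

In this toy input, the isolated "B" in the first turn and the isolated "A" in the second are treated as acoustic errors and overwritten by the majority label of their turn. A real system would also need to handle ties and unreliable turn predictions, which this sketch ignores.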
Related papers
- Integrating Audio, Visual, And Semantic Information For Enhanced Multimodal Speaker Diarization (2024) 0.00
- Improving Speaker Diarization Using Semantic Information: Joint Pairwise Constraints Propagation (2023) 0.00
- USED: Universal Speaker Extraction And Diarization (2023) 7.50
- Speaker Diarization With Lexical Information (2018) 9.76
- Enhancements For Audio-only Diarization Systems (2019) 0.00
- Novel Architectures For Unsupervised Information Bottleneck Based Speaker Diarization Of Meetings (2020) 8.09
- Robust Acoustic Domain Identification With Its Application To Speaker Diarization (2022) 2.26
- Target-speaker Voice Activity Detection: A Novel Approach For Multi-speaker Diarization In A Dinner Party Scenario (2020) 16.19