Improving Speaker Diarization Using Semantic Information: Joint Pairwise Constraints Propagation
2023 · Luyao Cheng, Siqi Zheng, Qinglin Zhang, et al.
Abstract
Speaker diarization has gained considerable attention within the speech processing research community. Mainstream speaker diarization systems rely primarily on speakers' voice characteristics extracted from acoustic signals and often overlook the potential of semantic information. Because speech signals efficiently convey the content of what is said, we aim to fully exploit these semantic cues using language models. In this work, we propose a novel approach to effectively leverage semantic information in clustering-based speaker diarization systems. First, we introduce spoken language understanding modules to extract speaker-related semantic information and use this information to construct pairwise constraints. Second, we present a novel framework to integrate these constraints into the speaker diarization pipeline, enhancing the performance of the entire system. Extensive experiments conducted on the public dataset demonstrate the consistent superiori
Related papers
- Integrating Audio, Visual, And Semantic Information For Enhanced Multimodal Speaker Diarization (2024)
- Exploring Speaker-related Information In Spoken Language Understanding For Better Speaker Diarization (2023)
- Speaker Diarization With Lexical Information (2018)
- Joint Training Of Speaker Embedding Extractor, Speech And Overlap Detection For Diarization (2024)
- End-to-end Speaker Diarization As Post-processing (2020)
- Enhancements For Audio-only Diarization Systems (2019)
- Leveraging Speaker Embeddings In End-to-end Neural Diarization For Two-speaker Scenarios (2024)
- Multimodal Speaker Segmentation And Diarization Using Lexical And Acoustic Cues Via Sequence To Sequence Neural Networks (2018)