Speaker Diarization With Lexical Information
2018 · Tae Jin Park, Kyu Han, Ian Lane, et al.
Abstract
This work presents a novel approach that leverages lexical information for speaker diarization. We introduce a speaker diarization system that directly integrates lexical as well as acoustic information into the speaker clustering process. To this end, we propose an adjacency matrix integration technique that combines word-level speaker turn probabilities with speaker embeddings in a comprehensive way. Our proposed method works without any reference transcript: words and word boundary information are provided by an ASR system. We show that our proposed method improves on a baseline speaker diarization system based solely on speaker embeddings, achieving a meaningful improvement on the CALLHOME American English Speech dataset.
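The abstract does not say how the adjacency matrices are built or combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one speaker embedding per ASR word, a speaker-change probability at each word boundary, a simple weighted sum as the integration step, and a two-speaker spectral split. All function names, the `threshold`, and the `weight` parameter are hypothetical.

```python
import numpy as np

def acoustic_affinity(embeddings):
    """Cosine similarity between per-word embeddings, rescaled to [0, 1]."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return (unit @ unit.T + 1.0) / 2.0

def lexical_adjacency(turn_probs, threshold=0.5):
    """Connect words that fall inside the same hypothesized speaker turn.

    turn_probs[i] is the assumed probability of a speaker change between
    word i and word i + 1, so it has length n_words - 1.
    """
    # Every boundary above the threshold starts a new turn id.
    changes = np.asarray(turn_probs) > threshold
    turn_ids = np.concatenate([[0], np.cumsum(changes)])
    return (turn_ids[:, None] == turn_ids[None, :]).astype(float)

def integrate(acoustic, lexical, weight=0.5):
    """Weighted sum of the two matrices (illustrative choice, an assumption)."""
    return (1.0 - weight) * acoustic + weight * lexical

def two_speaker_labels(affinity):
    """Two-way split by the sign of the Fiedler vector of the graph Laplacian."""
    laplacian = np.diag(affinity.sum(axis=1)) - affinity
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return (eigvecs[:, 1] > 0).astype(int)

# Toy input: six words, with a likely speaker change after the third word.
embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [1.0, 0.1],
                       [0.0, 1.0], [0.1, 0.9], [0.1, 1.0]])
turn_probs = [0.1, 0.1, 0.9, 0.1, 0.1]

combined = integrate(acoustic_affinity(embeddings), lexical_adjacency(turn_probs))
labels = two_speaker_labels(combined)
print(labels)
```

On this toy input the first three words receive one label and the last three the other; which group is labeled 0 or 1 is arbitrary, because the sign of an eigenvector is not fixed.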
Related papers
- Multimodal Speaker Segmentation And Diarization Using Lexical And Acoustic Cues Via Sequence To Sequence Neural Networks (2018)
- Exploring Speaker-related Information In Spoken Language Understanding For Better Speaker Diarization (2023)
- Leveraging Speaker Embeddings In End-to-end Neural Diarization For Two-speaker Scenarios (2024)
- Lexical Speaker Error Correction: Leveraging Language Models For Speaker Diarization Error Correction (2023)
- Speaker Diarization With LSTM (2017)
- Enhancing Speaker Diarization With Large Language Models: A Contextual Beam Search Approach (2023)
- Speaker Diarization Using Two-pass Leave-one-out Gaussian PLDA Clustering Of DNN Embeddings (2021)
- Integrating Audio, Visual, And Semantic Information For Enhanced Multimodal Speaker Diarization (2024)