Investigation Of End-to-end Speaker-attributed ASR For Continuous Multi-talker Recordings
2020 · Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, et al.
Abstract
Recently, an end-to-end (E2E) speaker-attributed automatic speech recognition (SA-ASR) model was proposed as a joint model of speaker counting, speech recognition and speaker identification for monaural overlapped speech. It showed promising results for simulated speech mixtures consisting of various numbers of speakers. However, the model required prior knowledge of speaker profiles to perform speaker identification, which significantly limited the application of the model. In this paper, we extend the prior work by addressing the case where no speaker profile is available. Specifically, we perform speaker counting and clustering by using the internal speaker representations of the E2E SA-ASR model to diarize the utterances of the speakers whose profiles are missing from the speaker inventory. We also propose a simple modification to the reference labels of the E2E SA-ASR training which helps handle continuous multi-talker recordings well. We conduct a comprehensive investigation of t
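The speaker counting and clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm is used, so this example assumes average-linkage agglomerative clustering over cosine distances with a stopping threshold; the number of clusters that remain is taken as the speaker count. The function names, the threshold value, and the toy 3-dimensional "speaker representations" are all hypothetical and are not taken from the paper.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def cluster_speakers(embeddings, threshold=0.5):
    """Assumed stand-in for the paper's clustering step.

    Average-linkage agglomerative clustering: repeatedly merge the two
    closest clusters until the closest pair is farther apart than
    `threshold`. Returns one cluster label per utterance.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Mean pairwise cosine distance between members of two clusters.
        total = sum(
            cosine_distance(embeddings[i], embeddings[j])
            for i in c1
            for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        if best[0] > threshold:
            break  # remaining clusters are treated as distinct speakers
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy per-utterance speaker representations (made up for illustration).
embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.0, 1.0],
]
labels = cluster_speakers(embeddings, threshold=0.5)
num_speakers = len(set(labels))  # speaker count = number of clusters left
print(labels, num_speakers)
```

In this sketch the threshold controls the estimated speaker count: a lower value splits utterances into more speakers, a higher value merges them into fewer.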
Related papers
- Hypothesis Stitcher For End-to-end Speaker-attributed ASR On Long-form Multi-talker Recordings (2021) 5.24
- A Comparative Study Of Modular And Joint Approaches For Speaker-attributed ASR On Monaural Long-form Audio (2021) 7.50
- Survey Of End-to-end Multi-speaker Automatic Speech Recognition For Monaural Audio (2025) 2.26
- Joint Speaker Counting, Speech Recognition, And Speaker Identification For Overlapped Speech Of Any Number Of Speakers (2020) 12.54
- Minimum Bayes Risk Training For End-to-end Speaker-attributed ASR (2020) 0.00
- Transcribe-to-diarize: Neural Speaker Diarization For Unlimited Number Of Speakers Using End-to-end Speaker-attributed ASR (2021) 11.49
- End-to-end Monaural Multi-speaker ASR System Without Pretraining (2018) 11.93
- A Comparative Study On Speaker-attributed Automatic Speech Recognition In Multi-party Meetings (2022) 8.09