LibriheavyMix: A 20,000-hour Dataset For Single-channel Reverberant Multi-talker Speech Separation, ASR And Speaker Diarization
2024 · Zengrui Jin, Yifan Yang, Mohan Shi, et al.
Abstract
The evolving speech processing landscape is increasingly focused on complex scenarios like meetings or cocktail parties with multiple simultaneous speakers and far-field conditions. Existing methodologies for addressing these challenges fall into two categories: multi-channel and single-channel solutions. Single-channel approaches, notable for their generality and convenience, do not require specific information about microphone arrays. This paper presents a large-scale far-field overlapping speech dataset, crafted to advance research in speech separation, recognition, and speaker diarization. This dataset is a critical resource for decoding "Who said What and When" in multi-talker, reverberant environments, a daunting challenge in the field. Additionally, we introduce a pipeline system encompassing speech separation, recognition, and diarization as a foundational benchmark. Evaluations on the WHAMR! dataset validate the broad applicability of the proposed data.
Related papers
- WHAMR!: Noisy And Reverberant Single-channel Speech Separation (2019)
- Integration Of Speech Separation, Diarization, And Recognition For Multi-speaker Meetings: System Description, Comparison, And Analysis (2020)
- SMS-WSJ: Database, Performance Measures, And Baseline Recipe For Multi-channel Source Separation And Recognition (2019)
- Time-domain Speech Extraction With Spatial Information And Multi Speaker Conditioning Mechanism (2021)
- Unified Modeling Of Multi-talker Overlapped Speech Recognition And Diarization With A Sidecar Separator (2023)
- End-to-end Dereverberation, Beamforming, And Speech Recognition With Improved Numerical Stability And Advanced Frontend (2021)
- Elevating Robust Multi-talker ASR By Decoupling Speaker Separation And Speech Recognition (2025)
- A Two-stage Speaker Extraction Algorithm Under Adverse Acoustic Conditions Using A Single-microphone (2023)