LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant
Multi-Talker Speech Separation, ASR and Speaker Diarization
Abstract
The evolving speech processing landscape is increasingly focused on complex
scenarios like meetings or cocktail parties with multiple simultaneous speakers
and far-field conditions. Existing methodologies for addressing these
challenges fall into two categories: multi-channel and single-channel
solutions. Single-channel approaches, notable for their generality and
convenience, do not require specific information about microphone arrays.
This paper presents a large-scale far-field overlapping speech dataset,
crafted to advance research in speech separation, recognition, and speaker
diarization. This dataset is a critical resource for decoding "Who said What
and When" in multi-talker, reverberant environments, a daunting challenge in
the field. Additionally, we introduce a pipeline system encompassing speech
separation, recognition, and diarization as a foundational benchmark.
Evaluations on the WHAMR! dataset validate the broad applicability of the
proposed data.
Related papers
- LibriheavyMix: A 20,000-hour Dataset For Single-channel Reverberant Multi-talker Speech Separation, ASR And Speaker Diarization (2024)
- WHAMR!: Noisy And Reverberant Single-channel Speech Separation (2019)
- RIR-Mega-Speech: A Reverberant Speech Corpus with Comprehensive Acoustic Metadata and Reproducible Evaluation (2026)
- SMS-WSJ: Database, Performance Measures, And Baseline Recipe For Multi-channel Source Separation And Recognition (2019)
- Building Corpora for Single-Channel Speech Separation Across Multiple Domains (2024)
- Treble10: A high-quality dataset for far-field speech recognition, dereverberation, and enhancement (2025)
- Integration Of Speech Separation, Diarization, And Recognition For Multi-speaker Meetings: System Description, Comparison, And Analysis (2020)
- Time-domain Speech Extraction With Spatial Information And Multi Speaker Conditioning Mechanism (2021)