LibriMix
Emerging
33 papers using it
First seen: 2022
LibriMix is a dataset that contains mixed speech recordings of multiple talkers and is used to evaluate multi-talker automatic speech recognition (MT-ASR) systems.
Papers using LibriMix (30)
- EEND-SS: Joint End-to-end Neural Speaker Diarization And Speech Separation For Flexible Number Of Speakers
- A Sidecar Separator Can Convert A Single-talker Speech Recognition System To A Multi-talker One
- Unified Modeling Of Multi-talker Overlapped Speech Recognition And Diarization With A Sidecar Separator
- Unifying Diarization, Separation, and ASR with Multi-Speaker Encoder
- Multiple Choice Learning For Efficient Speech Separation With Many Speakers
- Distilling LLM Semantic Priors into Encoder-Only Multi-Talker ASR with Talker-Count Routing
- SLM-SS: Speech Language Model for Generative Speech Separation
- Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
- Serialized Output Prompting for Large Language Model-based Multi-Talker Speech Recognition
- CMT-LLM: Contextual Multi-Talker ASR Utilizing Large Language Models
- An Investigation on Speaker Augmentation for End-to-End Speaker Extraction
- Universal Speaker Embedding Free Target Speaker Extraction and Personal Voice Activity Detection
- EEND-DEMUX: End-to-end Neural Speaker Diarization Via Demultiplexed Speaker Embeddings
- Hypothesis Clustering And Merging: Novel Multitalker Speech Recognition With Speaker Tokens
- A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One
- Noise-Aware Speech Separation with Contrastive Learning
- Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator
- EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings
- USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
- Individualized Conditioning and Negative Distances for Speaker Separation
- Monaural Multi-Speaker Speech Separation Using Efficient Transformer Model
- USED: Universal Speaker Extraction and Diarization
- Selective HuBERT: Self-Supervised Pre-Training for Target Speaker in Clean and Mixture Speech
- Target Speech Extraction with Pre-trained Self-supervised Learning Models
- Serialized Output Training by Learned Dominance
- Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System
- Advancing Multi-talker ASR Performance with Large Language Models
- Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
- Hypothesis Clustering and Merging: Novel MultiTalker Speech Recognition with Speaker Tokens
- Multiple Choice Learning for Efficient Speech Separation with Many Speakers