AMI

Emerging

43 papers using it

2021 first seen

The AMI Meeting Corpus consists of 100 hours of meeting recordings. The recordings use a range of signals synchronized to a common timeline. These include close-talking and far-field microphones, individual and room-view video cameras, and output from a slide projector and an electronic whiteboard. During the meetings,

Papers using AMI (43)

Scaling Multi-Talker ASR with Speaker-Agnostic Activity Streams · 2025 · 2 cites

SoulX-Transcriber: A Robust End-to-End Framework for Multi-Speaker Speech Transcription · 2026

Fast and Robust On-Device Speaker Diarization: Relative Minimum Cluster Size for Stride-Accelerated Pipelines · 2026

Non-Autoregressive Minimum Bayes' Risk Decoding for Fast Speech Recognition · 2026

Grounding Spoken LLMs in Multi-Speaker Audio via Diarization Conditioning · 2026

Teaching the Teachers: Boosting unsupervised domain adaptation in speech recognition by ensemble update · 2026

BiRQ: Bi-Level Self-Labeling Random Quantization for Self-Supervised Speech Recognition · 2025

Towards Robust Overlapping Speech Detection: A Speaker-Aware Progressive Approach Using WavLM · 2025

A Differentiable Alignment Framework for Sequence-to-Sequence Modeling via Optimal Transport · 2025

Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition · 2021 · 76 cites

Injecting Text in Self-Supervised Speech Pretraining · 2021 · 25 cites

Supervised Hierarchical Clustering using Graph Neural Networks for Speaker Diarization · 2023 · 12 cites

Advancing Multi-talker ASR Performance with Large Language Models · 2024 · 9 cites

GPU-accelerated Guided Source Separation for Meeting Transcription · 2022 · 5 cites

Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR · 2021 · 3 cites

FAST-RIR: Fast neural diffuse room impulse response generator · 2021 · 3 cites

Multi-Variant Consistency based Self-supervised Learning for Robust Automatic Speech Recognition · 2021 · 3 cites

Concurrent Speaker Detection: A multi-microphone Transformer-Based Approach · 2024 · 3 cites

Adapting self-supervised models to multi-talker speech recognition using speaker embeddings · 2022 · 2 cites

Progressive unsupervised domain adaptation for ASR using ensemble models and multi-stage training · 2024 · 2 cites

Ask2Mask: Guided Data Selection for Masked Speech Modeling · 2022 · 1 cite

Adapting Multi-Lingual ASR Models for Handling Multiple Talkers · 2023 · 1 cite

SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition · 2023 · 1 cite

End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis · 2023 · 1 cite

XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models · 2024 · 1 cite

Self-Supervised Metric Learning With Graph Clustering For Speaker Diarization · 2021

All-neural beamformer for continuous speech separation · 2021

Effective Cross-Utterance Language Modeling for Conversational Speech Recognition · 2021

Speech-enhanced and Noise-aware Networks for Robust Speech Recognition · 2022

CycleGAN-Based Unpaired Speech Dereverberation · 2022

Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition · 2022

ESSumm: Extractive Speech Summarization from Untranscribed Meeting · 2022

G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR · 2022

Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation · 2022

Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0 · 2022

Speech separation with large-scale self-supervised learning · 2022

Leveraging Cross-Utterance Context For ASR Decoding · 2023

End-to-End Supervised Hierarchical Graph Clustering for Speaker Diarization · 2024

On Speaker Attribution with SURT · 2024

Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization · 2024

LS-EEND: Long-Form Streaming End-to-End Neural Diarization with Online Attractor Extraction · 2024

Improving Automatic Speech Recognition with Decoder-Centric Regularisation in Encoder-Decoder Models · 2024

Online speaker diarization of meetings guided by speech separation · 2024
