Pyannote.audio: Neural Building Blocks For Speaker Diarization
2019 · Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, et al.
Abstract
We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection, speaker change detection, overlapped speech detection, and speaker embedding, reaching state-of-the-art performance for most of them.
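To make the idea of combining building blocks concrete, here is a minimal sketch of how frame-level outputs from two of the blocks named above (voice activity detection and speaker change detection) might be chained into speaker turns. It does not use the pyannote.audio API: the function names, the 0.5 thresholds, and the hand-written score lists are all hypothetical stand-ins for what trained neural models would produce.

```python
from typing import List, Tuple

# A region is a half-open frame interval [start, end).
Region = Tuple[int, int]


def speech_regions(vad_scores: List[float], threshold: float = 0.5) -> List[Region]:
    """Group consecutive frames whose (hypothetical) VAD score is >= threshold."""
    regions: List[Region] = []
    start = None
    for i, score in enumerate(vad_scores):
        if score >= threshold and start is None:
            start = i  # a speech region opens here
        elif score < threshold and start is not None:
            regions.append((start, i))  # region closes before frame i
            start = None
    if start is not None:
        # Speech runs until the end of the recording.
        regions.append((start, len(vad_scores)))
    return regions


def split_at_changes(
    regions: List[Region], scd_scores: List[float], threshold: float = 0.5
) -> List[Region]:
    """Cut each speech region at frames whose change score is >= threshold.

    Naive thresholding: every frame above the threshold triggers a cut,
    so a real system would more likely keep only local maxima.
    """
    turns: List[Region] = []
    for start, end in regions:
        cut = start
        for i in range(start + 1, end):
            if scd_scores[i] >= threshold:
                turns.append((cut, i))
                cut = i
        turns.append((cut, end))
    return turns


# Hand-written toy scores, one value per frame (assumption, not model output).
vad = [0.1, 0.9, 0.8, 0.9, 0.7, 0.2, 0.1, 0.8, 0.9, 0.1]
scd = [0.0, 0.1, 0.1, 0.9, 0.1, 0.0, 0.0, 0.1, 0.1, 0.0]

regions = speech_regions(vad)
turns = split_at_changes(regions, scd)
print(regions)
print(turns)
```

In a full diarization pipeline the resulting turns would then be assigned to speakers, for example by clustering one speaker embedding per turn; that step is left out of this sketch.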
Related papers
- Transcribe-to-diarize: Neural Speaker Diarization For Unlimited Number Of Speakers Using End-to-end Speaker-attributed ASR (2021) · 11.49
- A Toolkit For Joint Speaker Diarization And Identification With Application To Speaker-attributed ASR (2024) · 0.00
- 3D-Speaker-Toolkit: An Open-source Toolkit For Multimodal Speaker Verification And Diarization (2024) · 6.93
- Speaker Diarization Using Deep Recurrent Convolutional Neural Networks For Speaker Embeddings (2017) · 9.41
- EEND-SS: Joint End-to-end Neural Speaker Diarization And Speech Separation For Flexible Number Of Speakers (2022) · 10.35
- TalTech-IRIT-LIS Speaker And Language Diarization Systems For DISPLACE 2024 (2024) · 4.52
- Spot The Conversation: Speaker Diarisation In The Wild (2020) · 15.31
- Sequence-to-sequence Neural Diarization With Automatic Speaker Detection And Representation (2024) · 6.34