VoxCeleb

Canonical

38 papers using it

2021 first seen

A speaker-recognition dataset of utterances from thousands of celebrities, extracted from YouTube videos.

Papers using VoxCeleb (38)

Text-Independent Speaker Verification Using Discrete Audio Tokens · 2026

Neural Speaker Diarization via Multilingual Training: Evaluation on Low-Resource Nepali-Hindi Speech · 2026

TASLA: Text-Aligned Speech Tokens with Multiple Layer-Aggregation · 2025 · 2 cites

NonverbalTTS: A Public English Corpus of Text-Aligned Nonverbal Vocalizations with Emotion Annotations for Text-to-Speech · 2025 · 1 cite

IsoNet: Spatially-aware audio-visual target speech extraction in complex acoustic environments · 2026

Ring Mixing with Auxiliary Signal-to-Consistency-Error Ratio Loss for Unsupervised Denoising in Speech Separation · 2026

Vclip: Face-based Speaker Generation by Face-voice Association Learning · 2026

Rethinking Leveraging Pre-Trained Multi-Layer Representations for Speaker Verification · 2025

Magnitude and Phase-based Feature Fusion Using Co-attention Mechanism for Speaker recognition · 2025

Effective Modeling of Critical Contextual Information for TDNN-based Speaker Verification · 2025

Short-Segment Speaker Verification with Pre-trained Models and Multi-Resolution Encoder · 2025

Any-to-any Speaker Attribute Perturbation for Asynchronous Voice Anonymization · 2025

Clustering-based hard negative sampling for supervised contrastive speaker verification · 2025

MGFF-TDNN: A Multi-Granularity Feature Fusion TDNN Model with Depth-Wise Separable Module for Speaker Verification · 2025

Multi-Frequency Information Enhanced Channel Attention Module for Speaker Representation Learning · 2022 · 21 cites

Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification · 2021 · 9 cites

EDITnet: A Lightweight Network for Unsupervised Domain Adaptation in Speaker Verification · 2022 · 9 cites

Disentangling Voice and Content with Self-Supervision for Speaker Recognition · 2023 · 5 cites

Introducing ECAPA-TDNN and Wav2Vec2.0 Embeddings to Stuttering Detection · 2022 · 4 cites

CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking · 2023 · 3 cites

A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio · 2021 · 1 cite

Unsupervised Speech Enhancement with speech recognition embedding and disentanglement losses · 2021 · 1 cite

Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification · 2022 · 1 cite

Improving Text-Independent Speaker Verification with Auxiliary Speakers Using Graph · 2021

Multi-query multi-head attention pooling and Inter-topK penalty for speaker verification · 2021

MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification · 2021

MFA: TDNN with Multi-scale Frequency-channel Attention for Text-independent Speaker Verification with Short Utterances · 2022

Toroidal Probabilistic Spherical Discriminant Analysis · 2022

Laugh Betrays You? Learning Robust Speaker Representation From Speech Containing Non-Verbal Fragments · 2022

Model Compression for DNN-based Speaker Verification Using Weight Quantization · 2022

Distance-based Weight Transfer from Near-field to Far-field Speaker Verification · 2023

Self-FiLM: Conditioning GANs with self-supervised representations for bandwidth extension based speaker recognition · 2023

Ordered and Binary Speaker Embedding · 2023

Leveraging ASR Pretrained Conformers for Speaker Verification through Transfer Learning and Knowledge Distillation · 2023

SE/BN Adapter: Parametric Efficient Domain Adaptation for Speaker Recognition · 2024

AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling · 2024

M-Vec: Matryoshka Speaker Embeddings with Flexible Dimensions · 2024

Neural Scoring: A Refreshed End-to-End Approach for Speaker Recognition in Complex Conditions · 2024
