VoxCeleb-ESP: Preliminary Experiments Detecting Spanish Celebrities from Their Voices
2023 · Beltrán Labrador, Manuel Otero-Gonzalez, Alicia Lozano-Diez, et al.
Abstract
This paper presents VoxCeleb-ESP, a collection of pointers and timestamps to YouTube videos facilitating the creation of a novel speaker recognition dataset. VoxCeleb-ESP captures real-world scenarios, incorporating diverse speaking styles, noises, and channel distortions. It includes 160 Spanish celebrities spanning various categories, ensuring a representative distribution across age groups and geographic regions in Spain. We provide two speaker trial lists for speaker identification tasks, one with same-video target trials and the other with different-video target trials, accompanied by a cross-lingual evaluation of ResNet pretrained models. Preliminary speaker identification results suggest that the complexity of the detection task in VoxCeleb-ESP is equivalent to that of the original and much larger VoxCeleb in English. VoxCeleb-ESP contributes to the expansion of speaker recognition benchmarks with a comprehensive and diverse dataset for the Spanish language.
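To illustrate the distinction between the two trial lists, the following is a minimal sketch (not the authors' tooling) of how target trials might be split by source video. The `Utterance` record, its field names, the `build_trials` helper, and the sample identifiers are all hypothetical and do not reflect the dataset's actual schema.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Utterance:
    # Hypothetical record; field names are assumptions, not the dataset's schema.
    speaker: str
    video: str
    segment: str


def build_trials(utterances):
    """Pair up utterances and label each pair.

    Returns three lists of (utt_a, utt_b) pairs:
    same-video targets, different-video targets, and non-targets.
    """
    same_video, diff_video, non_target = [], [], []
    for a, b in combinations(utterances, 2):
        if a.speaker != b.speaker:
            # Different speakers: a non-target (impostor) trial.
            non_target.append((a, b))
        elif a.video == b.video:
            # Same speaker, same source video.
            same_video.append((a, b))
        else:
            # Same speaker, different source videos.
            diff_video.append((a, b))
    return same_video, diff_video, non_target


# Toy, made-up identifiers for illustration only.
utts = [
    Utterance("spk_A", "vid_1", "seg_0"),
    Utterance("spk_A", "vid_1", "seg_1"),
    Utterance("spk_A", "vid_2", "seg_0"),
    Utterance("spk_B", "vid_3", "seg_0"),
]
same_video, diff_video, non_target = build_trials(utts)
```

In this toy setup, one trial list would draw its target trials from `same_video` and the other from `diff_video`. How non-target trials are selected for each list is not stated in the abstract, so the exhaustive pairing here is only an assumption.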
Related papers
- VoxCeleb: A Large-Scale Speaker Identification Dataset (2017)
- VoxSRC 2019: The First VoxCeleb Speaker Recognition Challenge (2019)
- VoxCeleb2: Deep Speaker Recognition (2018)
- The Ins and Outs of Speaker Recognition: Lessons from VoxSRC 2020 (2020)
- VoxLingua107: A Dataset for Spoken Language Recognition (2020)
- VoxSRC 2020: The Second VoxCeleb Speaker Recognition Challenge (2020)
- VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark (2024)
- CN-Celeb: A Challenging Chinese Speaker Recognition Dataset (2019)