VoxLingua-107

Emerging

6 papers using it

First seen in 2021

VoxLingua-107 is a speech dataset used for training and evaluating spoken language identification models, and it also appears in work on dialect identification. It contains speech segments in 107 languages, collected automatically from YouTube videos.

Papers using VoxLingua-107 (6)

A Compact End-to-End Model with Local and Global Context for Spoken Language Identification · 2022 · 4 cites

Joint unsupervised and supervised learning for context-aware language identification · 2023 · 2 cites

Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models · 2022 · 1 cite

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale · 2021

Efficient Spoken Language Recognition via Multilabel Classification · 2023

Towards spoken dialect identification of Irish · 2023