Automatic Speech Recognition Advancements For Indigenous Languages Of The Americas
2024 · Monica Romero, Sandra Gomez, Ivan G. Torre
Abstract
Indigenous languages are a fundamental legacy in the development of human communication, embodying the unique identity and culture of local communities in America. The Second AmericasNLP (Americas Natural Language Processing) Competition Track 1 of NeurIPS (Neural Information Processing Systems) 2022 proposed the task of training automatic speech recognition (ASR) systems for five Indigenous languages: Quechua, Guarani, Bribri, Kotiria, and Wa'ikhana. In this paper, we describe the fine-tuning of a state-of-the-art ASR model for each target language, using approximately 36.65 h of transcribed speech data from diverse sources enriched with data augmentation methods. We systematically investigate, using a Bayesian search, the impact of the different hyperparameters on the Wav2vec2.0 XLS-R (Cross-Lingual Speech Representations) variants of 300 M and 1 B parameters. Our findings indicate that data and detailed hyperparameter tuning significantly affect ASR accuracy, but language complexity
Related papers
- Building Robust And Scalable Multilingual ASR For Indian Languages (2025)
- Dialect Adaptation And Data Augmentation For Low-resource ASR: Taltech Systems For The MADASR 2023 Challenge (2023)
- What Shall We Do With An Hour Of Data? Speech Recognition For The Un- And Under-served Languages Of Common Voice (2021)
- Exploring The Impact Of Data Quantity On ASR In Extremely Low-resource Languages (2024)
- Generative Adversarial Training Data Adaptation For Very Low-resource Automatic Speech Recognition (2020)
- Multilingual Speech Recognition With A Single End-to-end Model (2017)
- Towards End-to-end Training Of Automatic Speech Recognition For Nigerian Pidgin (2020)
- Indicvoices-r: Unlocking A Massive Multilingual Multi-speaker Speech Corpus For Scaling Indian TTS (2024)