Textual Data Augmentation For Arabic-English Code-switching Speech Recognition
2022 · Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, et al.
Abstract
The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that automatic speech recognition (ASR) systems handle mixed-language input. Designing a CS-ASR system poses many challenges, mainly due to data scarcity, grammatical structure complexity, and domain mismatch. The most common method for addressing CS is to train an ASR system on the available transcribed CS speech, along with monolingual data. In this work, we propose a zero-shot learning methodology for CS-ASR that augments the monolingual data with artificially generated CS text. We base our approach on random lexical replacements and the Equivalence Constraint (EC), exploiting aligned translation pairs to generate random and grammatically valid CS content. Our empirical results show a 65.5% relative reduction in language model perplexity, and 7.7% in ASR word error rate (WER), on two ecologically valid CS test sets. Human evaluation of the text generated using EC suggests that more than 80% is of adequate quality.
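The random lexical replacement idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_lexical_replacement`, the `ratio` parameter, and the assumption of a one-to-one word alignment are all hypothetical choices made for illustration, and the Equivalence Constraint check is not modeled here at all.

```python
import random


def random_lexical_replacement(src_tokens, tgt_tokens, alignment, ratio=0.2, seed=None):
    """Swap a random subset of aligned source words for their translations.

    src_tokens: tokens of the monolingual (e.g. Arabic) sentence.
    tgt_tokens: tokens of its translation (e.g. English).
    alignment:  list of (src_index, tgt_index) pairs; assumed one-to-one
                (a simplification, real aligners may emit many-to-many links).
    ratio:      fraction of aligned source words to replace (hypothetical knob).
    """
    rng = random.Random(seed)
    src_to_tgt = dict(alignment)
    candidates = sorted(src_to_tgt)
    if not candidates:
        # Nothing is aligned, so return the sentence unchanged.
        return list(src_tokens)

    # Replace at least one word so the output is actually code-switched.
    k = min(len(candidates), max(1, round(ratio * len(candidates))))
    chosen = set(rng.sample(candidates, k))

    return [
        tgt_tokens[src_to_tgt[i]] if i in chosen else token
        for i, token in enumerate(src_tokens)
    ]


if __name__ == "__main__":
    arabic = ["أنا", "أحب", "القهوة"]
    english = ["I", "love", "coffee"]
    links = [(0, 0), (1, 1), (2, 2)]
    print(random_lexical_replacement(arabic, english, links, ratio=0.3, seed=0))
```

Because the replaced positions are chosen without regard to syntax, output from a sketch like this can be ungrammatical; that is the gap a grammatical constraint such as EC is meant to close.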
Related papers
- Investigating Lexical Replacements For Arabic-English Code-switched Data Augmentation (2022)
- Speech Collage: Code-switched Audio Generation By Collaging Monolingual Corpora (2023)
- Acoustic And Textual Data Augmentation For Improved ASR Of Code-switching Speech (2018)
- Language-agnostic Code-switching In Sequence-to-sequence Speech Recognition (2022)
- Data Augmentation For End-to-end Code-switching Speech Recognition (2020)
- Code-switching Speech Recognition Under The Lens: Model- And Data-centric Perspectives (2025)
- Code-switching Detection With Data-augmented Acoustic And Language Models (2018)
- Exploring Retraining-free Speech Recognition For Intra-sentential Code-switching (2021)