End-to-end Code-switching ASR For Low-resourced Language Pairs
2019 · Xianghu Yue, Grandee Lee, Emre Yılmaz, et al.
Abstract
Despite the significant progress in end-to-end (E2E) automatic speech recognition (ASR), E2E ASR for low-resourced code-switching (CS) speech has not been well studied. In this work, we describe an E2E ASR pipeline for the recognition of CS speech in which a low-resourced language is mixed with a high-resourced language. Low-resourcedness in acoustic data hinders the performance of E2E ASR systems more severely than that of conventional ASR systems. To mitigate this problem in the transcription of archives with code-switching Frisian-Dutch speech, we integrate a designated decoding scheme and perform rescoring with neural network-based language models to enable better utilization of the available textual resources. We first incorporate a multi-graph decoding approach which creates parallel search spaces for each of the monolingual and mixed recognition tasks to maximize the utilization of the textual resources from each language. Further, language model rescoring is performed using a recurrent ne
Related papers
- Balanced End-to-end Monolingual Pre-training For Low-resourced Indic Languages Code-switching Speech Recognition (2021) · 0.00
- Multi-graph Decoding For Code-switching ASR (2019) · 4.52
- Acoustic And Textual Data Augmentation For Improved ASR Of Code-switching Speech (2018) · 9.92
- Code-switching Detection With Data-augmented Acoustic And Language Models (2018) · 3.58
- Language-agnostic Code-switching In Sequence-to-sequence Speech Recognition (2022) · 0.00
- Towards End-to-end Code-switching Speech Recognition (2018) · 0.00
- Code-switching Speech Recognition Under The Lens: Model- And Data-centric Perspectives (2025) · 0.00
- Semi-supervised Acoustic Model Training For Speech With Code-switching (2018) · 7.81