Exploring Retraining-free Speech Recognition For Intra-sentential Code-switching
2021 · Zhen Huang, Xiaodan Zhuang, Daben Liu, et al.
Abstract
In this paper, we present our initial efforts for building a code-switching (CS) speech recognition system leveraging existing acoustic models (AMs) and language models (LMs), i.e., no training required, and specifically targeting intra-sentential switching. To achieve such an ambitious goal, new mechanisms for foreign pronunciation generation and language model (LM) enrichment have been devised. Specifically, we have designed an automatic approach to obtain high quality pronunciation of foreign language (FL) words in the native language (NL) phoneme set using existing acoustic phone decoders and an LSTM-based grapheme-to-phoneme (G2P) model. Improved accented pronunciations have thus been obtained by learning foreign pronunciations directly from data. Furthermore, a code-switching LM was deployed by converting the original NL LM into a CS LM using translated word pairs and borrowing statistics for the NL LM. Experimental evidence clearly demonstrates that our approach better deals wit
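The LM enrichment idea in the abstract (turning an NL LM into a CS LM through translated word pairs and borrowed statistics) can be illustrated with a toy unigram model. This is only a minimal sketch under stated assumptions, not the authors' method: the function name `enrich_unigram_lm`, the `share` parameter, the unigram simplification, and the example words are all hypothetical and do not come from the paper.

```python
def enrich_unigram_lm(nl_probs, translation_pairs, share=0.5):
    """Build a toy code-switching unigram LM from a native-language one.

    nl_probs: dict mapping NL words to unigram probabilities.
    translation_pairs: iterable of (nl_word, fl_word) tuples.
    share: fraction of the NL word's probability mass lent to the FL word
           (an assumed knob for illustration, not a value from the paper).
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie in [0, 1]")

    cs_probs = dict(nl_probs)  # copy so the original NL LM is left untouched
    for nl_word, fl_word in translation_pairs:
        if nl_word not in cs_probs:
            continue  # no statistics to borrow for this pair
        moved = cs_probs[nl_word] * share
        cs_probs[nl_word] -= moved
        # The FL word borrows the NL word's statistics; total mass is preserved.
        cs_probs[fl_word] = cs_probs.get(fl_word, 0.0) + moved
    return cs_probs


# Toy usage with made-up probabilities and a made-up word pair.
nl_lm = {"yo": 0.5, "uso": 0.3, "computadora": 0.2}
cs_lm = enrich_unigram_lm(nl_lm, [("computadora", "computer")], share=0.25)
```

Because mass is only moved from an NL word to its translation, the enriched model stays normalized without retraining. A real system would have to do the equivalent over n-gram contexts and back-off weights, which this sketch does not attempt.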
Related papers
- Code-switching Detection With Data-augmented Acoustic And Language Models (2018)
- Acoustic And Textual Data Augmentation For Improved ASR Of Code-switching Speech (2018)
- Generative Error Correction For Code-switching Speech Recognition Using Large Language Models (2023)
- Integrating Knowledge In End-to-end Automatic Speech Recognition For Mandarin-english Code-switching (2021)
- End-to-end Code-switching ASR For Low-resourced Language Pairs (2019)
- Language-agnostic Code-switching In Sequence-to-sequence Speech Recognition (2022)
- Code-switching Speech Recognition Under The Lens: Model- And Data-centric Perspectives (2025)
- Enhancing Code-switched Text-to-speech Synthesis Capability In Large Language Models With Only Monolingual Corpora (2024)