On The End-to-end Solution To Mandarin-English Code-switching Speech Recognition
2018 · Zhiping Zeng, Yerbolat Khassanov, Van Tung Pham, et al.
Abstract
Code-switching (CS) refers to a linguistic phenomenon where a speaker uses different languages in an utterance or between alternating utterances. In this work, we study end-to-end (E2E) approaches to the Mandarin-English code-switching speech recognition (CSSR) task. We first examine the effectiveness of using data augmentation and byte-pair encoding (BPE) subword units. More importantly, we propose a multitask learning recipe, where a language identification task is explicitly learned in addition to the E2E speech recognition task. Furthermore, we introduce an efficient word vocabulary expansion method for language modeling to alleviate data sparsity issues under the code-switching scenario. Experimental results on the SEAME data, a Mandarin-English CS corpus, demonstrate the effectiveness of the proposed methods.
Related papers
- Integrating Knowledge In End-to-end Automatic Speech Recognition For Mandarin-English Code-switching (2021)
- Towards End-to-end Code-switching Speech Recognition (2018)
- Language-agnostic Code-switching In Sequence-to-sequence Speech Recognition (2022)
- An Effective Mixture-of-experts Approach For Code-switching Speech Recognition Leveraging Encoder Disentanglement (2024)
- The ASRU 2019 Mandarin-English Code-switching Speech Recognition Challenge: Open Datasets, Tracks, Methods And Results (2020)
- Data Augmentation For End-to-end Code-switching Speech Recognition (2020)
- End-to-end Code-switching ASR For Low-resourced Language Pairs (2019)
- Balanced End-to-end Monolingual Pre-training For Low-resourced Indic Languages Code-switching Speech Recognition (2021)