Generative Error Correction For Code-switching Speech Recognition Using Large Language Models
2023 · Chen Chen, Yuchen Hu, Chao-Han Huck Yang, et al.
Abstract
Code-switching (CS) speech refers to the phenomenon of mixing two or more languages within the same sentence. Despite recent advances in automatic speech recognition (ASR), CS-ASR remains a challenging task owing to the grammatical complexity of the phenomenon and the scarcity of dedicated training corpora. In this work, we propose to leverage large language models (LLMs) and lists of hypotheses generated by ASR systems to address the CS problem. Specifically, we first employ multiple well-trained ASR models for N-best hypotheses generation, with the aim of making the set of hypotheses more diverse and informative. Next, we utilize the LLMs to learn the hypotheses-to-transcription (H2T) mapping by adding a trainable low-rank adapter. Such a generative error correction (GER) method directly predicts the accurate transcription from the N-best hypotheses and the LLM's expert linguistic knowledge, resulting in a paradigm shift from the traditional language model r
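The first stage described in the abstract, pooling N-best lists from several ASR models before handing them to an LLM, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names `pool_hypotheses` and `build_h2t_prompt`, the prompt wording, and the use of a per-hypothesis score for ranking are not taken from the paper, and scores from different ASR models may not be directly comparable in practice. The adapter fine-tuning step is not shown.

```python
from typing import Dict, List, Tuple


def pool_hypotheses(
    nbest_per_model: Dict[str, List[Tuple[str, float]]], n: int = 5
) -> List[str]:
    """Merge N-best lists from several ASR models into one deduplicated list.

    Each hypothesis is a (text, score) pair, where a higher score is better.
    Ranking by raw score across models is an assumption of this sketch.
    """
    best_score: Dict[str, float] = {}
    for hyps in nbest_per_model.values():
        for text, score in hyps:
            key = " ".join(text.split())  # normalise whitespace before deduplicating
            if key not in best_score or score > best_score[key]:
                best_score[key] = score
    ranked = sorted(best_score.items(), key=lambda kv: kv[1], reverse=True)
    return [text for text, _ in ranked[:n]]


def build_h2t_prompt(hypotheses: List[str]) -> str:
    """Format pooled hypotheses as the input side of an H2T example.

    The template is hypothetical; the paper's actual prompt is not given here.
    """
    numbered = "\n".join(f"{i}. {h}" for i, h in enumerate(hypotheses, start=1))
    return (
        "Candidate transcriptions of one code-switched utterance:\n"
        f"{numbered}\n"
        "Corrected transcription:"
    )
```

In a setup like the one the abstract outlines, prompts of this kind would be paired with reference transcriptions so that a low-rank adapter on the LLM can be trained on them; pooling across models is what supplies the extra diversity in the candidate list.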
Related papers
- Aligning Speech To Languages To Enhance Code-switching Speech Recognition (2024) · 5.84
- Enhancing Code-switched Text-to-speech Synthesis Capability In Large Language Models With Only Monolingual Corpora (2024) · 0.00
- ASR Error Correction Using Large Language Models (2024) · 9.41
- Exploring Retraining-free Speech Recognition For Intra-sentential Code-switching (2021) · 5.84
- Multi-stage Large Language Model Correction For Speech Recognition (2023) · 0.00
- Code-switching Speech Recognition Under The Lens: Model- And Data-centric Perspectives (2025) · 0.00
- Language Modeling For Code-switching: Evaluation, Integration Of Monolingual Data, And Discriminative Training (2018) · 5.24
- Language-agnostic Code-switching In Sequence-to-sequence Speech Recognition (2022) · 0.00