GEC-RAG: Improving Generative Error Correction Via Retrieval-augmented Generation For Automatic Speech Recognition Systems
2025 · Amin Robatian, Mohammad Hajipour, Mohammad Reza Peyghan, et al.
Abstract
Automatic Speech Recognition (ASR) systems have demonstrated remarkable performance across various applications. However, limited data and the unique language features of specific domains, such as low-resource languages, significantly degrade their performance and lead to higher Word Error Rates (WER). In this study, we propose Generative Error Correction via Retrieval-Augmented Generation (GEC-RAG), a novel approach designed to improve ASR accuracy for low-resource domains, such as Persian. Our approach treats the ASR system as a black box, a common practice in cloud-based services, and applies Retrieval-Augmented Generation (RAG) within the In-Context Learning (ICL) scheme to enhance the quality of ASR predictions. By constructing a knowledge base that pairs ASR predictions (1-best and 5-best hypotheses) with their corresponding ground truths, GEC-RAG retrieves lexically similar examples to the ASR transcription using the Term Frequency-Inverse Document Frequency (TF-IDF) me
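The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the knowledge-base entries, the function names (`retrieve`, `build_prompt`), the smoothed IDF formula, the use of cosine similarity, and the prompt wording are all assumptions made for illustration. Only 1-best hypotheses are indexed here, and the call to a language model is left out.

```python
import math
from collections import Counter

# Hypothetical knowledge base: (ASR 1-best hypothesis, ground-truth transcript).
# These English pairs are invented for illustration only.
KNOWLEDGE_BASE = [
    ("the whether is nice today", "the weather is nice today"),
    ("i red the book last night", "i read the book last night"),
    ("please send me the male", "please send me the mail"),
]


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption, not the paper's)."""
    return text.lower().split()


def build_idf(docs):
    """Smoothed inverse document frequency for every term in the docs."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; terms unseen in the knowledge base are dropped."""
    tf = Counter(tokenize(text))
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, kb, k=2):
    """Return the k knowledge-base pairs whose hypotheses best match the query."""
    idf = build_idf([hyp for hyp, _ in kb])
    q_vec = tfidf_vector(query, idf)
    scored = [(cosine(q_vec, tfidf_vector(hyp, idf)), hyp, ref)
              for hyp, ref in kb]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def build_prompt(query, examples):
    """Assemble an in-context-learning prompt from the retrieved pairs."""
    parts = ["Correct the ASR transcription using the examples."]
    for _, hyp, ref in examples:
        parts.append(f"ASR: {hyp}\nCorrect: {ref}")
    parts.append(f"ASR: {query}\nCorrect:")
    return "\n\n".join(parts)


query = "the whether is cold today"
top = retrieve(query, KNOWLEDGE_BASE, k=2)
prompt = build_prompt(query, top)
```

In this sketch the retrieved pairs act as few-shot demonstrations: the prompt ends with the new ASR output and an empty `Correct:` slot for the language model to complete. A library vectorizer could replace the hand-written TF-IDF; the plain-Python version is used only to keep the example self-contained.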
Related papers
- Failing Forward: Improving Generative Error Correction For ASR With Synthetic Data And Retrieval Augmentation (2024) 3.58
- LA-RAG: Enhancing LLM-based ASR Accuracy With Retrieval-augmented Generation (2024) 0.00
- ED-CEC: Improving Rare Word Recognition Using ASR Postprocessing Based On Error Detection And Context-aware Error Correction (2023) 6.34
- LipGER: Visually-conditioned Generative Error Correction For Robust Automatic Speech Recognition (2024) 2.26
- Channel-aware Domain-adaptive Generative Adversarial Network For Robust Speech Recognition (2024) 4.52
- Efficient Acoustic Feature Transformation In Mismatched Environments Using A Guided-GAN (2022) 2.26
- Audiorag+: Feedback-driven Retrieval-augmented Audio Generation With Large Audio Language Models (2025) 0.00
- G2G: TTS-driven Pronunciation Learning For Graphemic Hybrid ASR (2019) 8.35