Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages
2023 · Devang Kulshreshtha, Saket Dingliwal, Brady Houston, et al.
Abstract
Connectionist Temporal Classification (CTC) models are popular for their balance between speed and performance for Automatic Speech Recognition (ASR). However, these CTC models still struggle in other areas, such as personalization towards custom words. A recent approach explores Contextual Adapters, wherein an attention-based biasing model for CTC is used to improve the recognition of custom entities. While this approach works well with enough data, we showcase that it isn't an effective strategy for low-resource languages. In this work, we propose a supervision loss for smoother training of the Contextual Adapters. Further, we explore a multilingual strategy to improve performance with limited training data. Our method achieves 48% F1 improvement in retrieving unseen custom entities for a low-resource language. Interestingly, as a by-product of training the Contextual Adapters, we see a 5-11% Word Error Rate (WER) reduction in the performance of the base CTC model as well.
Related papers
- Towards Personalization Of CTC Speech Recognition Models With Contextual Adapters And Adaptive Boosting (2022)
- Multilingual Training And Cross-lingual Adaptation On CTC-based Acoustic Model (2017)
- Sequence-based Multi-lingual Low Resource Speech Recognition (2018)
- Contextual Adapters For Personalized Speech Recognition In Neural Transducers (2022)
- Fast Context-biasing For CTC And Transducer ASR Models With CTC-based Word Spotter (2024)
- Fast Contextual Adaptation With Neural Associative Memory For On-device Personalized Speech Recognition (2021)
- Attention-based Contextual Language Model Adaptation For Speech Recognition (2021)
- Reducing Spelling Inconsistencies In Code-switching ASR Using Contextualized CTC Loss (2020)