A Two-stage Transliteration Approach To Improve Performance Of A Multilingual ASR
2024 · Rohit Kumar
Abstract
End-to-end Automatic Speech Recognition (ASR) systems are rapidly emerging as the state of the art over other modeling methods. Several techniques have been introduced to improve their ability to handle multiple languages. However, because different languages use different writing scripts, acoustically similar units are not always decoded to an appropriate grapheme in the target language. This restricts the scalability and adaptability of the model when dealing with multiple languages in code-mixing scenarios. This paper presents an approach to build a language-agnostic end-to-end model trained on a grapheme set obtained by projecting the multilingual grapheme data to the script of a more generic target language. This approach spares the acoustic model from being retrained to span a larger space and can easily be extended to multiple languages. A two-stage transliteration process realizes this approach and is shown to minimize speech-class confusion. We performed experim
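The idea of projecting mixed-script text onto one common script and then restoring the original script can be illustrated with a minimal sketch. This is not the paper's implementation: the word-level lookup tables, the choice of Latin as the common script, and the function names `project_to_common` and `restore_script` are all assumptions made for illustration, and a real system would likely use a trained transliteration model rather than a dictionary.

```python
# Toy illustration only. The word tables, the choice of Latin as the common
# script, and both function names are hypothetical, not taken from the paper.

# Stage 1 table: words in a source script mapped to a common script.
TO_COMMON = {
    "मेरा": "mera",
    "नाम": "naam",
}

# Stage 2 table: the inverse mapping, used to restore the original script.
FROM_COMMON = {common: source for source, common in TO_COMMON.items()}


def project_to_common(tokens):
    """Stage 1: map each token to the common script.

    Tokens with no table entry (assumed to be in the common script
    already) pass through unchanged.
    """
    return [TO_COMMON.get(tok, tok) for tok in tokens]


def restore_script(tokens):
    """Stage 2: map common-script tokens back to their original script."""
    return [FROM_COMMON.get(tok, tok) for tok in tokens]


# A code-mixed transcript: two Devanagari words around one Latin-script word.
mixed = ["मेरा", "phone", "नाम"]
common = project_to_common(mixed)   # single-script text for acoustic training
restored = restore_script(common)   # applied to the recognizer's output
```

In this sketch the recognizer would only ever see the single-script form in `common`, so its output grapheme set does not grow as languages are added; the per-language scripts reappear only in the second stage.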
Related papers
- Multilingual Speech Recognition With A Single End-to-end Model (2017) 16.05
- Transformer-transducers For Code-switched Speech Recognition (2020) 10.97
- Towards One Model To Rule All: Multilingual Strategy For Dialectal Code-switching Arabic ASR (2021) 9.03
- G2G: TTS-driven Pronunciation Learning For Graphemic Hybrid ASR (2019) 8.35
- Multilingual Speech Recognition Using Knowledge Transfer Across Learning Processes (2021) 0.00
- Investigating The Impact Of Cross-lingual Acoustic-phonetic Similarities On Multilingual Speech Recognition (2022) 3.58
- End-to-end ASR For Code-switched Hindi-english Speech (2019) 0.00
- Improving Cross-lingual Transfer Learning For End-to-end Speech Recognition With Speech Translation (2020) 9.92