Improving Uyghur ASR Systems With Decoders Using Morpheme-based Language Models
2020 · Zicheng Qiu, Wei Jiang, Turghunjan Mamut
Abstract
Uyghur is a minority language, and resources for its Automatic Speech Recognition (ASR) research have always been insufficient. THUYG-20 is currently the only open-source dataset of Uyghur speech. The state-of-the-art results on its clean, noise-free speech test task have not been updated since the first release, which shows a large gap in ASR development between mainstream languages and Uyghur. In this paper, we try to bridge the gap by thoroughly optimizing the ASR systems and by developing a long-missing morpheme-based decoder, MLDG-Decoder (Morpheme Lattice Dynamically Generating Decoder for Uyghur DNN-HMM systems). We have open-sourced the decoder. The MLDG-Decoder employs an algorithm named "on-the-fly composition with FEBABOS", which allows the back-off states and transitions to act as relay stations in on-the-fly composition. The algorithm empowers the dynamically generated graph to constrain the morpheme sequences in the lattices as effectively as th