Performance Improvements Of Probabilistic Transcript-adapted ASR With Recurrent Neural Network And Language-specific Constraints
2016 · Xiang Kong, Preethi Jyothi, Mark Hasegawa-Johnson
Abstract
Mismatched transcriptions have been proposed as a means to acquire probabilistic transcriptions from non-native speakers of a language. Prior work has demonstrated the value of these transcriptions by successfully adapting cross-lingual ASR systems for different target languages. In this work, we describe two techniques to refine these probabilistic transcriptions: a noisy-channel model of non-native phone misperception is trained using a recurrent neural network, and decoded using minimally-resourced language-dependent pronunciation constraints. Both innovations improve the quality of the transcript, and both reduce the phone error rate of a trained ASR, by 7% and 9% respectively.
Related papers
- Robust Neural Machine Translation For Clean And Noisy Speech Transcripts (2019)
- Audio-attention Discriminative Language Model For ASR Rescoring (2019)
- Residual Adapters For Parameter-efficient ASR Adaptation To Atypical And Accented Speech (2021)
- A Two-stage Transliteration Approach To Improve Performance Of A Multilingual ASR (2024)
- Improving RNN Transducer Based ASR With Auxiliary Tasks (2020)
- Pretraining By Backtranslation For End-to-end ASR In Low-resource Settings (2018)
- Generating Human Readable Transcript For Automatic Speech Recognition With Pre-trained Language Model (2021)
- Patcorrect: Non-autoregressive Phoneme-augmented Transformer For ASR Error Correction (2023)