Multi-staged Cross-lingual Acoustic Model Adaption For Robust Speech Recognition In Real-world Applications -- A Case Study On German Oral History Interviews
2020 · Michael Gref, Oliver Walter, Christoph Schmidt, et al.
Abstract
While recent automatic speech recognition systems achieve remarkable performance when large amounts of adequate, high-quality annotated speech data are used for training, the same systems often achieve only unsatisfactory results for tasks in domains that deviate greatly from the conditions represented by the training data. For many real-world applications, there is not enough data that can be used directly to train robust speech recognition systems. To address this issue, we propose and investigate an approach that performs a robust acoustic model adaption to a target domain in a cross-lingual, multi-staged manner. Our approach enables the exploitation of large-scale training data from other domains in both the same and other languages. We evaluate our approach using the challenging task of German oral history interviews, where we achieve a relative reduction of the word error rate by more than 30% compared to a model trained from scratch only on the target domain, and
Related papers
- Human And Automatic Speech Recognition Performance On German Oral History Interviews (2022)
- Investigating The Impact Of Cross-lingual Acoustic-phonetic Similarities On Multilingual Speech Recognition (2022)
- A Highly Adaptive Acoustic Model For Accurate Multi-dialect Speech Recognition (2022)
- Massively Multilingual Adversarial Speech Recognition (2019)
- Learning Cross-lingual Mappings For Data Augmentation To Improve Low-resource Speech Recognition (2023)
- Toward Domain-invariant Speech Recognition Via Large Scale Training (2018)
- Multilingual Training And Cross-lingual Adaptation On CTC-based Acoustic Model (2017)
- Cross-lingual Low Resource Speaker Adaptation Using Phonological Features (2021)