Sequence-based Multi-lingual Low Resource Speech Recognition
2018 · Siddharth Dalmia, Ramon Sanabria, Florian Metze, et al.
Abstract
Techniques for multi-lingual and cross-lingual speech recognition can help in low resource scenarios, to bootstrap systems and enable analysis of new languages and domains. End-to-end approaches, in particular sequence-based techniques, are attractive because of their simplicity and elegance. While it is possible to integrate traditional multi-lingual bottleneck feature extractors as front-ends, we show that end-to-end multi-lingual training of sequence models is effective on context independent models trained using Connectionist Temporal Classification (CTC) loss. We show that our model improves performance on Babel languages by over 6% absolute in terms of word/phoneme error rate when compared to mono-lingual systems built in the same setting for these languages. We also show that the trained model can be adapted cross-lingually to an unseen language using just 25% of the target data. We show that training on multiple languages is important for very low resource cross-lingual target
Related papers
- Analysis Of Multilingual Sequence-to-sequence Speech Recognition Systems (2018) 0.00
- Multilingual Sequence-to-sequence Speech Recognition: Architecture, Transfer Learning, And Language Modeling (2018) 13.84
- Adaptive Activation Network For Low Resource Multilingual Speech Recognition (2022) 0.00
- Multilingual End-to-end Speech Recognition With A Single Transformer On Low-resource Languages (2018) 0.00
- Learning Cross-lingual Mappings For Data Augmentation To Improve Low-resource Speech Recognition (2023) 0.00
- Multilingual Speech Recognition With A Single End-to-end Model (2017) 16.05
- Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages (2023) 4.52
- Exploring End-to-end Techniques For Low-resource Speech Recognition (2018) 5.84