Leveraging Translations For Speech Transcription In Low-resource Settings
2018 · Antonis Anastasopoulos, David Chiang
Abstract
Recently proposed data collection frameworks for endangered language documentation aim not only to collect speech in the language of interest, but also to collect translations into a high-resource language that will render the collected resource interpretable. We focus on this scenario and explore whether we can improve transcription quality under these extremely low-resource settings with the assistance of text translations. We present a neural multi-source model and evaluate several variations of it on three low-resource datasets. We find that our multi-source model with shared attention outperforms the baselines, reducing transcription character error rate by up to 12.3%.
Related papers
- Efficient Neural Speech Synthesis For Low-resource Languages Through Multilingual Modeling (2020)
- Pretraining By Backtranslation For End-to-end ASR In Low-resource Settings (2018)
- Strategies For Improving Low Resource Speech To Text Translation Relying On Pre-trained ASR Models (2023)
- Multilingual Byte2speech Models For Scalable Low-resource Speech Synthesis (2021)
- Generative Adversarial Training Data Adaptation For Very Low-resource Automatic Speech Recognition (2020)
- Extending Multilingual Speech Synthesis To 100+ Languages Without Transcribed Data (2024)
- End-to-end Text-to-speech For Low-resource Languages By Cross-lingual Transfer Learning (2019)
- An Unsupervised Probability Model For Speech-to-translation Alignment Of Low-resource Languages (2016)