Multilingual End-to-end Speech Translation
2019 · Hirofumi Inaguma, Kevin Duh, Tatsuya Kawahara, et al.
Abstract
In this paper, we propose a simple yet effective framework for multilingual end-to-end speech translation (ST), in which speech utterances in source languages are directly translated to the desired target languages with a universal sequence-to-sequence architecture. While multilingual models have been shown to be useful for automatic speech recognition (ASR) and machine translation (MT), this is the first time they have been applied to the end-to-end ST problem. We show the effectiveness of multilingual end-to-end ST in two scenarios: one-to-many and many-to-many translation, using publicly available data. We experimentally confirm that multilingual end-to-end ST models significantly outperform bilingual ones in both scenarios. The generalization of multilingual training is also evaluated in a transfer learning scenario to a very low-resource language pair. All of our code and the database are publicly available to encourage further research on this emergent multilingual ST topic.
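The abstract does not say how a single "universal" model is told which target language to produce. One common approach in multilingual sequence-to-sequence work is to pool the bilingual corpora and mark each target sequence with a language tag. The Python sketch below illustrates that idea only. The `Example` record, the `<2xx>` tag format, and the `pool_corpora` helper are all hypothetical names chosen for illustration and are not taken from the authors' released code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    audio_path: str        # path to a source-language utterance (hypothetical)
    src_lang: str          # source language code, e.g. "en"
    tgt_lang: str          # target language code, e.g. "de"
    tgt_tokens: List[str]  # tokenized target-language translation


def lang_tag(lang: str) -> str:
    """Build an assumed target-language tag token such as '<2de>'."""
    return f"<2{lang}>"


def pool_corpora(corpora: List[List[Example]]) -> List[Example]:
    """Merge bilingual ST corpora into one multilingual training set.

    Each target sequence gets a language tag prepended so that a single
    decoder can be conditioned on the desired output language.
    The input corpora are left unmodified.
    """
    pooled = []
    for corpus in corpora:
        for ex in corpus:
            tagged = [lang_tag(ex.tgt_lang)] + ex.tgt_tokens
            pooled.append(
                Example(ex.audio_path, ex.src_lang, ex.tgt_lang, tagged)
            )
    return pooled


# One-to-many: a single source language, several target languages.
en_de = [Example("utt1.wav", "en", "de", ["hallo", "welt"])]
en_fr = [Example("utt2.wav", "en", "fr", ["bonjour"])]
one_to_many = pool_corpora([en_de, en_fr])

# Many-to-many: several source and several target languages.
es_en = [Example("utt3.wav", "es", "en", ["hello"])]
many_to_many = pool_corpora([en_de, en_fr, es_en])
```

Under this assumption, the one-to-many and many-to-many scenarios differ only in which bilingual corpora are pooled. The model architecture itself stays the same.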