Joint Training And Decoding For Multilingual End-to-end Simultaneous Speech Translation
2025 · Wuwei Huang, Renren Jin, Wen Zhang, et al.
Abstract
Recent studies on end-to-end speech translation (ST) have facilitated the exploration of multilingual end-to-end ST and end-to-end simultaneous ST. In this paper, we investigate end-to-end simultaneous speech translation in a one-to-many multilingual setting, which is closer to real-world applications. We explore a separate decoder architecture and a unified architecture for joint synchronous training in this scenario. To further explore knowledge transfer across languages, we propose an asynchronous training strategy on the proposed unified decoder architecture. A multi-way aligned multilingual end-to-end ST dataset was curated as a benchmark testbed to evaluate our methods. Experimental results demonstrate the effectiveness of our models on the collected dataset. Our code and data are available at: https://github.com/XiaoMi/TED-MMST.
Related papers
- Multilingual End-to-end Speech Translation (2019)
- One-to-many Multilingual End-to-end Speech Translation (2019)
- Synchronous Speech Recognition And Speech-to-text Translation With Interactive Decoding (2019)
- Direct Simultaneous Speech-to-text Translation Assisted By Synchronized Streaming ASR (2021)
- Dual-decoder Transformer For Joint Automatic Speech Recognition And Multilingual Speech Translation (2020)
- Rethinking And Improving Multi-task Learning For End-to-end Speech Translation (2023)
- Tagged End-to-end Simultaneous Speech Translation Training Using Simultaneous Interpretation Data (2023)
- Data Efficient Direct Speech-to-text Translation With Modality Agnostic Meta-learning (2019)