Improving Speech Translation By Understanding And Learning From The Auxiliary Text Translation Task
2021 · Yun Tang, Juan Pino, Xian Li, et al.
Abstract
Pretraining and multitask learning are widely used to improve speech-to-text translation performance. In this study, we train a speech-to-text translation model jointly with an auxiliary text-to-text translation task. We conduct a detailed analysis to understand the impact of the auxiliary task on the primary task within the multitask learning framework. Our analysis confirms that multitask learning tends to generate similar decoder representations from different modalities and to preserve more information from the pretrained text translation modules. We observe a minimal negative transfer effect between the two tasks, and sharing more parameters helps transfer knowledge from the text task to the speech task. The analysis also reveals that the modality representation difference at the top decoder layers is still not negligible, and that those layers are critical for translation quality. Inspired by these findings, we propose three methods to improve translation
Related papers
- Rethinking And Improving Multi-task Learning For End-to-end Speech Translation (2023) 5.84
- Improving Cross-lingual Transfer Learning For End-to-end Speech Recognition With Speech Translation (2020) 9.92
- Multitask Training With Text Data For End-to-end Speech Recognition (2020) 7.50
- Improved Self-supervised Multilingual Speech Representation Learning Combined With Auxiliary Language Information (2022) 0.00
- Data Efficient Direct Speech-to-text Translation With Modality Agnostic Meta-learning (2019) 0.00
- Joint Training And Decoding For Multilingual End-to-end Simultaneous Speech Translation (2025) 0.95
- Hierarchical Multitask Learning For CTC-based Speech Recognition (2018) 0.00
- A Comparative Study On End-to-end Speech To Text Translation (2019) 11.67