Improving Cascaded Unsupervised Speech Translation With Denoising Back-translation
2023 · Yu-Kuan Fu, Liang-Hsuan Tseng, Jiatong Shi, et al.
Abstract
Most speech translation models rely heavily on parallel data, which is hard to collect, especially for low-resource languages. To tackle this issue, we propose to build a cascaded speech translation system without leveraging any kind of paired data. We use fully unpaired data to train our unsupervised systems and evaluate our results on CoVoST 2 and CVSS. The results show that our work is comparable with some early supervised methods in some language pairs. Since cascaded systems suffer from severe error propagation, we propose denoising back-translation (DBT), a novel approach to building robust unsupervised neural machine translation (UNMT). DBT increases the BLEU score by 0.7 to 0.9 in all three translation directions. Moreover, we simplify the pipeline of our cascaded system to reduce inference latency and conduct a comprehensive analysis of every part of our work. We also demonstrate our unsupervised speech translation results on the est
Related papers
- Leveraging Unsupervised And Weakly-supervised Data To Improve Direct Speech-to-speech Translation (2022)
- Cascaded Models With Cyclic Feedback For Direct Speech Translation (2020)
- Towards Unsupervised Speech-to-text Translation (2018)
- Tight Integrated End-to-end Training For Cascaded Speech Translation (2020)
- Leveraging Weakly Supervised Data To Improve End-to-end Speech-to-text Translation (2018)
- When End-to-end Is Overkill: Rethinking Cascaded Speech-to-text Translation (2025)
- Translatotron 3: Speech To Speech Translation With Monolingual Data (2023)
- Enhanced Direct Speech-to-speech Translation Using Self-supervised Pre-training And Data Augmentation (2022)