Leveraging Synthetic Audio Data For End-to-end Low-resource Speech Translation

2024

Abstract

This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2024) for Irish-to-English speech translation. We built end-to-end systems based on Whisper, and employed a number of data augmentation techniques, such as speech back-translation and noise augmentation. We investigate the effect of using synthetic audio data and discuss several methods for enriching signal diversity.
