End-to-end Spoken Language Translation
2019 · Michelle Guo, Albert Haque, Prateek Verma
Abstract
In this paper, we address the task of spoken language understanding. We present a method for translating spoken sentences from one language into spoken sentences in another language. Given spectrogram-spectrogram pairs, our model can be trained completely from scratch to translate unseen sentences. Our method consists of a pyramidal-bidirectional recurrent network combined with a convolutional network to output sentence-level spectrograms in the target language. Empirically, our model achieves competitive performance with state-of-the-art methods on multiple languages and can generalize to unseen speakers.
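The abstract names a pyramidal-bidirectional recurrent network but gives no layer details. As a rough illustration only, the sketch below shows the time-reduction step commonly associated with pyramidal recurrent encoders: adjacent frames are concatenated so each layer sees a sequence half as long. The function name `pyramid_reduce`, the reduction factor of 2, the three reduction steps, and the 100 x 80 input shape are all assumptions made for this example, not values taken from the paper.

```python
import numpy as np


def pyramid_reduce(frames: np.ndarray, factor: int = 2) -> np.ndarray:
    """Concatenate every `factor` consecutive frames along the feature axis.

    frames: (time, features) array, e.g. a spectrogram or a layer's outputs.
    Returns an array of shape (time // factor, features * factor).
    Trailing frames that do not fill a complete group are dropped.
    """
    time, features = frames.shape
    usable = (time // factor) * factor
    return frames[:usable].reshape(usable // factor, features * factor)


# Hypothetical input: 100 frames with 80 frequency bins each (assumed sizes).
spectrogram = np.random.default_rng(0).standard_normal((100, 80))

# Shape bookkeeping for three stacked reductions. A real encoder would run a
# bidirectional recurrent layer between reductions, which would also change
# the feature dimension; that part is omitted here.
x = spectrogram
shapes = [x.shape]
for _ in range(3):
    x = pyramid_reduce(x)
    shapes.append(x.shape)
```

Under these assumptions the sequence length shrinks from 100 to 50, 25, and then 12 frames, which is the reason a pyramidal stack is attractive for long inputs such as sentence-level spectrograms: the later recurrent layers process far fewer time steps.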
Related papers
- Multilingual End-to-end Speech Translation (2019)
- Towards End-to-end Spoken Language Understanding (2018)
- Direct Speech-to-speech Translation With A Sequence-to-sequence Model (2019)
- Long-form End-to-end Speech Translation Via Latent Alignment Segmentation (2023)
- One-to-many Multilingual End-to-end Speech Translation (2019)
- Soft Language Identification For Language-agnostic Many-to-one End-to-end Speech Translation (2024)
- Speech-language Pre-training For End-to-end Spoken Language Understanding (2021)
- End-to-end Speech-to-text Translation: A Survey (2023)