Simultaneous Translation For Unsegmented Input: A Sliding Window Approach
2022 · Sukanta Sen, Ondřej Bojar, Barry Haddow
Abstract
In the cascaded approach to spoken language translation (SLT), the ASR output is typically punctuated and segmented into sentences before being passed to MT, since the latter is typically trained on written text. However, erroneous segmentation, due to poor sentence-final punctuation by the ASR system, leads to degradation in translation quality, especially in the simultaneous (online) setting where the input is continuously updated. To reduce the influence of automatic segmentation, we present a sliding window approach to translate raw ASR outputs (online or offline) without needing to rely on an automatic segmenter. We train translation models using parallel windows (instead of parallel sentences) extracted from the original training data. At test time, we translate at the window level and join the translated windows using a simple approach to generate the final translation. Experiments on English-to-German and English-to-Czech show that our approach improves 1.3--2.0 BLEU points ove
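The abstract does not state the window size, the stride, or the exact joining rule, so the following is only a minimal sketch under assumptions: windows are fixed-length token spans taken at a fixed stride, and consecutive translated windows are joined by dropping the longest overlap between the end of the output so far and the start of the next window. The names `sliding_windows` and `join_windows` are hypothetical, and the MT step is replaced by an identity placeholder so the example stays self-contained.

```python
from typing import Iterator, List


def sliding_windows(tokens: List[str], size: int, stride: int = 1) -> Iterator[List[str]]:
    """Yield fixed-size windows over an unsegmented token stream.

    Assumption: window size and stride are free parameters; the source
    abstract does not specify them.
    """
    if len(tokens) <= size:
        yield list(tokens)
        return
    starts = list(range(0, len(tokens) - size + 1, stride))
    # Add a final window so the tail of the stream is always covered.
    if starts[-1] + size < len(tokens):
        starts.append(len(tokens) - size)
    for start in starts:
        yield tokens[start:start + size]


def join_windows(merged: List[str], new: List[str]) -> List[str]:
    """Append `new` to `merged`, dropping the longest suffix/prefix overlap.

    Assumption: this overlap heuristic stands in for the paper's
    "simple approach" to joining, which the abstract does not describe.
    """
    max_k = min(len(merged), len(new))
    for k in range(max_k, 0, -1):
        if merged[-k:] == new[:k]:
            return merged + new[k:]
    return merged + new  # no overlap found: plain concatenation


def translate_stream(tokens: List[str], size: int, stride: int) -> List[str]:
    """Translate window by window and join the results."""
    output: List[str] = []
    for window in sliding_windows(tokens, size, stride):
        translated = list(window)  # placeholder for a window-level MT call
        output = join_windows(output, translated)
    return output


if __name__ == "__main__":
    stream = "a b c d e f g".split()
    print(translate_stream(stream, size=4, stride=2))
```

With real MT output the overlapping regions of neighbouring windows will often differ, and repeated tokens can produce spurious matches, so an exact-match overlap like this one is only a starting point.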
Related papers
- Don't Discard Fixed-window Audio Segmentation In Speech-to-text Translation (2022) · 0.00
- Impact Of Encoding And Segmentation Strategies On End-to-end Simultaneous Speech Translation (2021) · 4.52
- Long-form End-to-end Speech Translation Via Latent Alignment Segmentation (2023) · 0.00
- Direct Simultaneous Speech-to-text Translation Assisted By Synchronized Streaming ASR (2021) · 6.77
- Fluent And Low-latency Simultaneous Speech-to-speech Translation With Self-adaptive Training (2020) · 3.58
- Subtitles To Segmentation: Improving Low-resource Speech-to-text Translation Pipelines (2020) · 0.00
- Long-form Speech Translation Through Segmentation With Finite-state Decoding Constraints On Large Language Models (2023) · 0.00
- Re-translation Strategies For Long Form, Simultaneous, Spoken Language Translation (2019) · 9.23