Is 42 The Answer To Everything In Subtitling-oriented Speech Translation?
2020 · Alina Karakanta, Matteo Negri, Marco Turchi
Abstract
Subtitling is becoming increasingly important for disseminating information, given the enormous amounts of audiovisual content becoming available daily. Although Neural Machine Translation (NMT) can speed up the process of translating audiovisual content, large manual effort is still required for transcribing the source language, and for spotting and segmenting the text into proper subtitles. Creating proper subtitles in terms of timing and segmentation depends heavily on information present in the audio (utterance duration, natural pauses). In this work, we explore two methods for applying Speech Translation (ST) to subtitling: a) a direct end-to-end approach and b) a classical cascade approach. We discuss the benefit of having access to the source language speech for improving the conformity of the generated subtitles to the spatial and temporal subtitling constraints, and show that length is not the answer to everything in the case of subtitling-oriented ST.
Related papers
- Direct Speech Translation For Automatic Subtitling (2022)
- Dodging The Data Bottleneck: Automatic Subtitling With Automatically Segmented ST Corpora (2022)
- Between Flexibility And Consistency: Joint Generation Of Captions And Subtitles (2021)
- Learning To Jointly Transcribe And Subtitle For End-to-end Spontaneous Speech Recognition (2022)
- Leveraging Broadcast Media Subtitle Transcripts For Automatic Speech Recognition And Subtitling (2025)
- Isochrony-controlled Speech-to-text Translation: A Study On Translating From Sino-tibetan To Indo-european Languages (2024)
- How To Evaluate Speech Translation With Source-aware Neural MT Metrics (2025)
- Large-scale Streaming End-to-end Speech Translation With Neural Transducers (2022)