Visualization: The Missing Factor In Simultaneous Speech Translation
2021 · Sara Papi, Matteo Negri, Marco Turchi
Abstract
Simultaneous speech translation (SimulST) is the task in which output generation has to be performed on partial, incremental speech input. In recent years, SimulST has become popular due to the spread of cross-lingual application scenarios, like international live conferences and streaming lectures, in which on-the-fly speech translation can facilitate users' access to audio-visual content. In this paper, we analyze the characteristics of the SimulST systems developed so far, discussing their strengths and weaknesses. We then concentrate on the evaluation framework required to properly assess systems' effectiveness. To this end, we raise the need for a broader performance analysis, also including the user experience standpoint. SimulST systems, indeed, should be evaluated not only in terms of quality/latency measures, but also via task-oriented metrics accounting, for instance, for the visualization strategy adopted. In light of this, we highlight which are the goals achieved by the co
Related papers
- Simulsense: Sense-driven Interpreting For Efficient Simultaneous Speech Translation (2025)
- Efficient And Adaptive Simultaneous Speech Translation With Fully Unidirectional Architecture (2025)
- CA*: Addressing Evaluation Pitfalls In Computation-aware Latency For Simultaneous Speech Translation (2024)
- Does Simultaneous Speech Translation Need Simultaneous Models? (2022)
- Exploring Continuous Integrate-and-fire For Adaptive Simultaneous Speech Translation (2022)
- Tagged End-to-end Simultaneous Speech Translation Training Using Simultaneous Interpretation Data (2023)
- Towards The Evaluation Of Automatic Simultaneous Speech Translation From A Communicative Perspective (2021)
- Streamspeech: Simultaneous Speech-to-speech Translation With Multi-task Learning (2024)