Contrastive Feedback Mechanism For Simultaneous Speech Translation
2024 · Haotian Tan, Sakriani Sakti
Abstract
Recent advances in simultaneous speech translation (SST) focus on the decision policies that enable the use of offline-trained ST models for simultaneous inference. These decision policies not only control the quality-latency trade-off in SST but also mitigate the impact of unstable predictions on translation quality by delaying translation for more context or discarding these predictions through stable hypothesis detection. However, these policies often overlook the potential benefits of utilizing unstable predictions. We introduce the contrastive feedback mechanism (CFM) for SST, a novel method that leverages these unstable predictions as feedback to improve translation quality. CFM guides the system to eliminate undesired model behaviors from these predictions through a contrastive objective. The experiments on 3 state-of-the-art decision policies across 8 languages in the MuST-C v1.0 dataset show that CFM effectively improves the performance of SST.