
MuST-C

Emerging

50 papers using it

First seen: 2021

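MuST-C is a multilingual speech translation corpus built from English TED talks, pairing audio with transcripts and translations into multiple target languages. It is used to evaluate speech translation systems, including simultaneous and speech-to-speech settings.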


Papers using MuST-C (50)

SimulU: Training-free Policy for Long-form Simultaneous Speech-to-Speech Translation · 2026

Optimal Multi-Task Learning at Regularization Horizon for Speech Translation Task · 2025

InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model · 2025

Optimizing Speech Multi-View Feature Fusion through Conditional Computation · 2025

Speech Translation Refinement using Large Language Models · 2025

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation · 2022 · 32 cites

M3ST: Mix at Three Levels for Speech Translation · 2022 · 14 cites

Speechformer: Reducing Information Loss in Direct Speech Translation · 2021 · 8 cites

T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation · 2022 · 8 cites

Pre-training for Speech Translation: CTC Meets Optimal Transport · 2023 · 7 cites

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations · 2023 · 6 cites

Rethinking and Improving Multi-task Learning for End-to-end Speech Translation · 2023 · 4 cites

SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training · 2022 · 3 cites

Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement · 2021 · 2 cites

STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation · 2022 · 2 cites

GigaST: A 10,000-hour Pseudo Speech Translation Corpus · 2022 · 2 cites

Simple and Effective Unsupervised Speech Translation · 2022 · 2 cites

AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation · 2022 · 2 cites

WACO: Word-Aligned Contrastive Learning for Speech Translation · 2022 · 2 cites

Tuning Large language model for End-to-end Speech Translation · 2023 · 2 cites

Learning When to Translate for Streaming Speech · 2021 · 1 cite

Efficient Speech Translation with Dynamic Latent Perceivers · 2022 · 1 cite

SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations · 2022 · 1 cite

Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks · 2023 · 1 cite

Understanding and Bridging the Modality Gap for Speech Translation · 2023 · 1 cite

CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation · 2023 · 1 cite

Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation · 2023 · 1 cite

Soft Alignment of Modality Space for End-to-end Speech Translation · 2023 · 1 cite

FASST: Fast LLM-based Simultaneous Speech Translation · 2024 · 1 cite

Unified Speech-Text Pre-training for Speech Translation and Recognition · 2022

Cross-modal Contrastive Learning for Speech Translation · 2022

Generating Synthetic Speech from SpokenVocab for Speech Translation · 2022

RedApt: An Adaptor for wav2vec 2 Encoding Faster and Smaller Speech Translation without Quality Compromise · 2022

Decouple Non-parametric Knowledge Distillation For End-to-end Speech Translation · 2023

Improving speech translation by fusing speech and text · 2023

CTC-based Non-autoregressive Speech Translation · 2023

Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23 · 2023

Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation · 2023

Implicit Memory Transformer for Computationally Efficient Simultaneous Speech Translation · 2023

Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts · 2023

An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation · 2023

Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition · 2023

Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing · 2023

Pushing the Limits of Zero-shot End-to-End Speech Translation · 2024

SimulTron: On-Device Simultaneous Speech to Speech Translation · 2024

Task Arithmetic for Language Expansion in Speech Translation · 2024

Representation Purification for End-to-End Speech Translation · 2024

On the Impact of Noises in Crowd-Sourced Data for Speech Translation · 2022

Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning · 2023

Incremental Blockwise Beam Search for Simultaneous Speech Translation with Controllable Quality-Latency Tradeoff · 2023