Code-switching Without Switching: Language Agnostic End-to-end Speech Translation
2022 · Christian Huber, Enes Yavuz Ugan, Alexander Waibel
Abstract
We propose a) a Language Agnostic end-to-end Speech Translation model (LAST), and b) a data augmentation strategy to increase code-switching (CS) performance. With growing globalization, multiple languages are increasingly used interchangeably within fluent speech. Such CS complicates traditional speech recognition and translation: we must first recognize which language was spoken, then apply a language-dependent recognizer and a subsequent translation component to generate the desired target-language output. Such a pipeline introduces latency and errors. In this paper, we eliminate the need for it by treating speech recognition and translation as one unified end-to-end speech translation problem. By training LAST on both input languages, we decode speech into one target language regardless of the input language. LAST delivers comparable recognition and speech translation accuracy in monolingual usage, while considerably reducing latency and error rate when CS is observed.
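The contrast the abstract draws between a cascaded pipeline and a single language-agnostic model can be illustrated with a toy sketch. This is not the LAST architecture or the authors' code: the word-level lexicons, the language tags "de" and "en", and the function names `identify_language`, `cascaded_translate`, and `unified_translate` are all hypothetical stand-ins, and "audio" is faked as a list of (language, word) pairs. The sketch only shows why a single per-utterance language decision can fail on code-switched input while a model covering both input languages does not need that decision.

```python
from typing import Dict, List, Tuple

# Toy stand-in for audio: each segment is (language tag, spoken word).
Segment = Tuple[str, str]

# Hypothetical toy lexicons: source word -> target-language word.
# The language pair is an illustrative assumption, not taken from the abstract.
LEXICON: Dict[str, Dict[str, str]] = {
    "de": {"hallo": "hello", "welt": "world"},
    "en": {"hello": "hello", "world": "world"},
}


def identify_language(utterance: List[Segment]) -> str:
    """Cascade stage 1: one language decision per utterance (here, the first segment's tag)."""
    return utterance[0][0]


def cascaded_translate(utterance: List[Segment]) -> List[str]:
    """Language ID, then a language-dependent recognizer, then translation."""
    lang = identify_language(utterance)
    # Stage 2: the recognizer for `lang` cannot handle words from the other language.
    recognized = [word if seg_lang == lang else "<unk>" for seg_lang, word in utterance]
    # Stage 3: translate the recognized words into the target language.
    return [LEXICON[lang].get(word, "<unk>") for word in recognized]


def unified_translate(utterance: List[Segment]) -> List[str]:
    """One model covering both input languages; no language decision is made."""
    merged = {**LEXICON["de"], **LEXICON["en"]}
    return [merged.get(word, "<unk>") for _, word in utterance]


monolingual = [("de", "hallo"), ("de", "welt")]
code_switched = [("de", "hallo"), ("en", "world")]
```

On the monolingual toy utterance both functions return the same output, while on the code-switched one the cascade emits `<unk>` for the segment whose language does not match its initial decision.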
Related papers
- Language-agnostic Code-switching In Sequence-to-sequence Speech Recognition (2022)
- Code-switching Speech Recognition Under The Lens: Model- And Data-centric Perspectives (2025)
- Integrating Knowledge In End-to-end Automatic Speech Recognition For Mandarin-english Code-switching (2021)
- Aligning Speech To Languages To Enhance Code-switching Speech Recognition (2024)
- On The End-to-end Solution To Mandarin-english Code-switching Speech Recognition (2018)
- End-to-end Code-switching ASR For Low-resourced Language Pairs (2019)
- Exploring Retraining-free Speech Recognition For Intra-sentential Code-switching (2021)
- Towards End-to-end Code-switching Speech Recognition (2018)