Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages
2023 · Kuan-Po Huang, Chih-Kai Yang, Yu-Kuan Fu, et al.
Abstract
We introduce a new zero-resource code-switched speech benchmark designed to directly assess the code-switching capabilities of self-supervised speech encoders. We showcase a baseline system of language modeling on discrete units to demonstrate how the code-switching abilities of speech encoders can be assessed in a zero-resource manner. Our experiments cover a variety of well-known speech encoders, including Wav2vec 2.0, HuBERT, and XLSR. We examine the impact of pre-training languages and model size on benchmark performance. Notably, although speech encoders with multilingual pre-training, exemplified by XLSR, outperform monolingual variants (Wav2vec 2.0, HuBERT) in code-switching scenarios, our results show that there is still substantial room for improvement in their code-switching linguistic abilities.
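The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of how a pair-based, zero-resource evaluation over discrete units could look. It assumes each benchmark item is a pair of unit sequences (one acceptable, one not) and that a system is counted correct when its unit language model assigns the acceptable sequence the higher length-normalized log-probability. The bigram model, the add-one smoothing, the function names, and the toy unit sequences are all illustrative assumptions, not the authors' baseline.

```python
import math
from collections import Counter


def train_bigram_lm(sequences, vocab_size):
    """Fit a bigram model on discrete-unit sequences (hypothetical stand-in
    for a unit language model) and return a scoring function."""
    bigrams = Counter()
    contexts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1

    def score(seq):
        # Length-normalized log-probability with add-one smoothing,
        # so unseen bigrams get a small but non-zero probability.
        total = 0.0
        for prev, cur in zip(seq, seq[1:]):
            prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
            total += math.log(prob)
        return total / max(len(seq) - 1, 1)

    return score


def pair_accuracy(score, pairs):
    """Fraction of (acceptable, unacceptable) pairs in which the
    acceptable sequence receives the strictly higher score."""
    correct = sum(score(good) > score(bad) for good, bad in pairs)
    return correct / len(pairs)


# Toy data: integers stand in for discrete units obtained by quantizing
# speech-encoder features (assumed, e.g. via k-means clustering).
train_units = [[0, 1, 2, 3], [0, 1, 2, 3], [1, 2, 3, 0]]
eval_pairs = [
    ([0, 1, 2, 3], [3, 2, 1, 0]),
    ([1, 2, 3, 0], [0, 3, 2, 1]),
]

score = train_bigram_lm(train_units, vocab_size=4)
accuracy = pair_accuracy(score, eval_pairs)
print(f"pair accuracy: {accuracy:.2f}")
```

Under this setup no transcripts or labels are needed at evaluation time: only the relative ranking of the two utterances in each pair matters, which is what makes the comparison zero-resource.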
Related papers
- The Zero Resource Speech Benchmark 2021: Metrics And Baselines For Unsupervised Spoken Language Modeling (2020)
- Multilingual Self-supervised Speech Representations Improve The Speech Recognition Of Low-resource African Languages With Codeswitching (2023)
- The Zero Resource Speech Challenge 2020: Discovering Discrete Subword And Word Units (2020)
- Benchmarking Evaluation Metrics For Code-switching Automatic Speech Recognition (2022)
- Self-supervised Language Learning From Raw Audio: Lessons From The Zero Resource Speech Challenge (2022)
- Constrained Output Embeddings For End-to-end Code-switching Speech Recognition With Only Monolingual Data (2019)
- Switchlingua: The First Large-scale Multilingual And Multi-ethnic Code-switching Dataset (2025)
- Language-agnostic Code-switching In Sequence-to-sequence Speech Recognition (2022)