Improved Language Identification Through Cross-lingual Self-supervised Learning
2021 · Andros Tjandra, Diptanu Gon Choudhury, Frank Zhang, et al.
Abstract
Language identification greatly impacts the success of downstream tasks such as automatic speech recognition. Recently, self-supervised speech representations learned by wav2vec 2.0 have been shown to be very effective for a range of speech tasks. We extend previous self-supervised work on language identification by experimenting with pre-trained models that were trained on real-world, unconstrained speech in multiple languages rather than on English alone. We show that models pre-trained on many languages perform better and enable language identification systems that need very little labeled data to perform well. Results on a 26-language setup show that, with only 10 minutes of labeled data per language, a cross-lingually pre-trained model can achieve over 89.2% accuracy.
Related papers
- Accidental Learners: Spoken Language Identification In Multilingual Self-supervised Models (2022)
- Improved Self-supervised Multilingual Speech Representation Learning Combined With Auxiliary Language Information (2022)
- Exploring Wav2vec 2.0 On Speaker Verification And Language Identification (2020)
- An Adapter Based Pre-training For Efficient And Scalable Self-supervised Speech Representation Learning (2021)
- Joint Unsupervised And Supervised Learning For Context-aware Language Identification (2023)
- XLST: Cross-lingual Self-training To Learn Multilingual Representation For Low Resource Speech Recognition (2021)
- Maximizing Data Efficiency For Cross-lingual TTS Adaptation By Self-supervised Representation Mixing And Embedding Initialization (2024)
- Supervised Acoustic Embeddings And Their Transferability Across Languages (2023)