ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, And Datasets
2024 · Jiatong Shi, Shih-Heng Wang, William Chen, et al.
Abstract
ML-SUPERB evaluates self-supervised learning (SSL) models on language identification and automatic speech recognition (ASR). The benchmark treats the models as feature extractors and uses a single shallow downstream model, which can be fine-tuned for a downstream task. However, real-world use cases may require different configurations. This paper presents ML-SUPERB 2.0, a new benchmark for evaluating pre-trained SSL and supervised speech models across downstream models, fine-tuning setups, and efficient model adaptation approaches. We find performance improvements over the original ML-SUPERB setup, although performance depends on the downstream model design. We also find large performance differences between languages and datasets, suggesting the need for more targeted approaches to improve multilingual ASR performance.
Related papers
- ML-SUPERB: Multilingual Speech Universal Performance Benchmark (2023)
- Findings Of The 2023 ML-SUPERB Challenge: Pre-training And Evaluation Over More Languages And Beyond (2023)
- SUPERB @ SLT 2022: Challenge On Generalization And Efficiency Of Self-supervised Speech Representation Learning (2022)
- SUPERB-SG: Enhanced Speech Processing Universal Performance Benchmark For Semantic And Generative Capabilities (2022)
- Characterizing The Adversarial Vulnerability Of Speech Self-supervised Learning (2021)
- LeBenchmark: A Reproducible Framework For Assessing Self-supervised Representation Learning From Speech (2021)
- LeBenchmark 2.0: A Standardized, Replicable And Enhanced Framework For Self-supervised Representations Of French Speech (2023)
- Speech Self-supervised Representations Benchmarking: A Case For Larger Probing Heads (2023)