LeBenchmark 2.0: A Standardized, Replicable And Enhanced Framework For Self-supervised Representations Of French Speech
2023 · Titouan Parcollet, Ha Nguyen, Solene Evain, et al.
Abstract
Self-supervised learning (SSL) is at the origin of unprecedented improvements in many domains, including computer vision and natural language processing. Speech processing has benefited drastically from SSL, as most current tasks in the domain are now approached with pre-trained models. This work introduces LeBenchmark 2.0, an open-source framework for assessing and building SSL-equipped French speech technologies. It includes documented, large-scale and heterogeneous corpora with up to 14,000 hours of speech; ten pre-trained SSL wav2vec 2.0 models, shared with the community, containing from 26 million to one billion learnable parameters; and an evaluation protocol made of six downstream tasks to complement existing benchmarks. LeBenchmark 2.0 also presents unique perspectives on pre-trained SSL models for speech with the investigation of frozen versus fine-tuned downstream models, task-agnostic versus task-specific pre-trained models as well as a discussi
Related papers
- LeBenchmark: A Reproducible Framework For Assessing Self-supervised Representation Learning From Speech (2021)
- Speech Self-supervised Representations Benchmarking: A Case For Larger Probing Heads (2023)
- ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, And Datasets (2024)
- Investigating Self-supervised Learning For Speech Enhancement And Separation (2022)
- SUPERB @ SLT 2022: Challenge On Generalization And Efficiency Of Self-supervised Speech Representation Learning (2022)
- An Adapter Based Pre-training For Efficient And Scalable Self-supervised Speech Representation Learning (2021)
- Analyzing The Factors Affecting Usefulness Of Self-supervised Pre-trained Representations For Speech Recognition (2022)
- ML-SUPERB: Multilingual Speech Universal Performance Benchmark (2023)