Belebele

Name: Belebele
License: cc-by-sa-4.0

Emerging

3 papers using it

26,809 HF downloads

128 HF likes

2024 first seen

The Belebele Benchmark for Massively Multilingual NLU Evaluation

Belebele is a multiple-choice machine reading comprehension (MRC) dataset spanning 122 language variants. This dataset enables the evaluation of mono- and multi-lingual models in high-, medium-, and low-resource languages. Each question has four multiple-
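As a rough illustration of how a four-option reading comprehension item like those described above might be scored, here is a minimal Python sketch. The `MRCItem` class, its field names, and the `format_prompt` and `accuracy` helpers are all hypothetical and chosen for this example; they are not the dataset's actual schema or an official evaluation script, and the toy items in the usage note are invented rather than taken from Belebele.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class MRCItem:
    """One multiple-choice reading comprehension item (hypothetical schema)."""
    passage: str
    question: str
    choices: Tuple[str, str, str, str]  # four candidate answers
    answer_index: int                   # 0-based index of the correct choice


def format_prompt(item: MRCItem) -> str:
    """Render an item as a plain-text prompt with options labelled A to D."""
    labels = "ABCD"
    options = "\n".join(
        f"{labels[i]}. {choice}" for i, choice in enumerate(item.choices)
    )
    return (
        f"Passage: {item.passage}\n"
        f"Question: {item.question}\n"
        f"{options}\n"
        "Answer:"
    )


def accuracy(items: Sequence[MRCItem], predict: Callable[[MRCItem], int]) -> float:
    """Fraction of items for which predict() returns the correct index."""
    if not items:
        return 0.0
    correct = sum(1 for item in items if predict(item) == item.answer_index)
    return correct / len(items)
```

A usage sketch with invented toy items: build a list such as `[MRCItem("The sky is blue.", "What color is the sky?", ("Red", "Blue", "Green", "Black"), 1)]`, pass it to `accuracy` together with any callable that maps an item to a predicted index (for example, a wrapper that sends `format_prompt(item)` to a model and parses the returned letter), and compare per-language scores. With four options per question, a uniformly random guesser would be expected to score around 0.25, which is a useful baseline when reading results for low-resource languages.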

🤗 Hugging Face · ⚖ cc-by-sa-4.0

Papers using Belebele (3)

Layer-wise Swapping for Generalizable Multilingual Safety (2026)

Zero-Shot Cross-Lingual Transfer using Prefix-Based Adaptation (2025)

Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement (2024)