MMLU-Pro
Emerging
14 papers using it
168,573 HF downloads
485 HF likes
2025 first seen
MMLU-Pro Dataset
The MMLU-Pro dataset is a more robust and challenging massive multi-task understanding dataset, designed to benchmark large language models' capabilities more rigorously. It contains 12K complex questions across various disciplines.
GitHub | 🏆 Leaderboard | 📖 Paper
🚀 What's New [2026.03.11] Ad
🤗 Hugging Face | ⚖ MIT
Papers using MMLU-Pro (14)
- Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning
- Open-Medical-R1: How to Choose Data for RLVR Training at Medicine Domain
- Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings
- Apriel-1.5-OpenReasoner: RL Post-Training for General-Purpose and Efficient Reasoning
- FlipLLM: Efficient Bit-Flip Attacks on Multimodal LLMs using Reinforcement Learning
- TraPO: A Semi-Supervised Reinforcement Learning Framework for Boosting LLM Reasoning
- Group-Aware Reinforcement Learning for Output Diversity in Large Language Models
- Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs
- Reinforcing General Reasoning without Verifiers
- Reinforcement Fine-Tuning Naturally Mitigates Forgetting in Continual Post-Training
- Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation
- Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning
- Reinforcement Inference: Leveraging Uncertainty for Self-Correcting Language Model Reasoning
- SUPERNOVA: Eliciting General Reasoning in LLMs with Reinforcement Learning on Natural Instructions