BeyondAIME
Emerging
6 papers using it
827 HF downloads
18 HF likes
First seen: 2025
BeyondAIME: Advancing Math Reasoning Evaluation Beyond High School Olympiads

Dataset Description

BeyondAIME is a curated test set designed to benchmark advanced mathematical reasoning. Its creation was guided by the following core principles to ensure a fair and challenging evaluation:

- High Difficulty: Problems are sou
Hugging Face | License: cc0-1.0
Papers using BeyondAIME (6)
- UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities
- Learn Hard Problems During RL with Reference Guided Fine-tuning
- MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism
- Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
- Prioritize the Process, Not Just the Outcome: Rewarding Latent Thought Trajectories Improves Reasoning in Looped Language Models
- Evolutionary System Prompt Learning for Reinforcement Learning in LLMs