AIME-24
Emerging
36 papers using it
6,620 HF downloads
18 HF likes
2024 first seen
AIME 24
American Invitational Mathematics Examination (AIME) 2024

Citation
If you use the AIME24 dataset in your research, please consider citing it as follows:

@misc{aime24,
  title={American Invitational Mathematics Examination (AIME) 2024},
  author={Zhang, Yifan and Math-AI, Team},
  year={2024},
}
🤗 Hugging Face · apache-2.0
Papers using AIME-24 (36)
- QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation
- Squeeze the Soaked Sponge: Efficient Off-policy Reinforcement Finetuning for Large Language Model
- Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners
- Mitigating Distribution Sharpening in Math RLVR via Distribution-Aligned Hint Synthesis and Backward Hint Annealing
- Learn Hard Problems During RL with Reference Guided Fine-tuning
- Off-Policy Value-Based Reinforcement Learning for Large Language Models
- InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
- iGRPO: Self-Feedback-Driven LLM Reasoning
- Latent Poincaré Shaping for Agentic Reinforcement Learning
- Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization
- Masked-and-Reordered Self-Supervision for Reinforcement Learning from Verifiable Rewards
- GRPO-λ: Credit Assignment improves LLM Reasoning
- Towards High Data Efficiency in Reinforcement Learning with Verifiable Reward
- DCPO: Dynamic Clipping Policy Optimization
- SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
- SPEC-RL: Accelerating On-Policy Reinforcement Learning with Speculative Rollouts
- Enhancing Math Reasoning in Small-sized LLMs via Preview Difficulty-Aware Intervention
- First Return, Entropy-Eliciting Explore
- SRPO: A Cross-Domain Implementation of Large-Scale Reinforcement Learning on LLM
- Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
- Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
- Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning
- Process Reward Models That Think
- On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning
- Skywork Open Reasoner 1 Technical Report
- Promoting Efficient Reasoning with Verifiable Stepwise Reward
- Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation
- SIRI: Scaling Iterative Reinforcement Learning with Interleaved Compression
- Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning
- Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning
- Generative Adversarial Reasoner: Enhancing LLM Reasoning with Adversarial Reinforcement Learning
- Prompt Augmentation Scales up GRPO Training on Mathematical Reasoning
- Prioritize the Process, Not Just the Outcome: Rewarding Latent Thought Trajectories Improves Reasoning in Looped Language Models
- SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling
- ASI-Evolve: AI Accelerates AI