AIME-24/25
Emerging, 3 papers using it
First seen: 2025
The AIME-24/25 benchmark is used to evaluate reinforcement learning models, particularly in agentic reinforcement learning settings, including how well they handle noisy trajectories during problem-solving tasks.