← all datasets

AIME-24/25

Emerging
3 papers using it
2025 first seen

The AIME-24/25 benchmark is used to evaluate reinforcement learning models, particularly agentic reinforcement learning models, and their ability to handle noisy trajectories during problem-solving tasks.

Papers using AIME-24/25 (3)
