
AIME 2025-2026

Emerging
1 paper using it
First seen: 2026

AIME 2025-2026 is a benchmark dataset of reasoning tasks. It is used to evaluate model performance on reasoning-intensive problems, particularly in the context of retrieval-augmented generation.
