RE-Bench
Emerging · 2 papers using it
First seen: 2026
RE-Bench is a benchmark dataset for evaluating how well AI agents understand, reproduce, and extend research artifacts. It supplies those artifacts as structured, machine-executable research packages.