Humanity's Last Exam

Emerging
4 papers using it
2025 first seen

Humanity's Last Exam is a benchmark used to evaluate search agents, specifically their performance on high-difficulty tasks.

Papers using Humanity's Last Exam (4)
