HumanEval

Canonical
3 papers using it
1,780 HF downloads
95 HF likes
2026 first seen

HumanEval-X is a benchmark for evaluating the multilingual ability of code generation models. It consists of 820 high-quality, human-crafted data samples, each with test cases, in Python, C++, Java, JavaScript, and Go, and can be used for various tasks.
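As a rough illustration of how a sample "with test cases" can be used to check generated code, the sketch below joins a prompt, a model completion, and the tests into one program and runs it. The sample contents and the field names `prompt` and `test`, as well as the helper `passes_tests`, are hypothetical and chosen for this example; they are not taken from the dataset itself.

```python
# Hypothetical sample: the field names and contents are assumptions for illustration.
sample = {
    "prompt": 'def add(a, b):\n    """Return the sum of a and b."""\n',
    "test": "assert add(1, 2) == 3\nassert add(-1, 1) == 0\n",
}


def passes_tests(sample, completion):
    """Return True if prompt + completion runs the sample's tests without error."""
    program = sample["prompt"] + completion + "\n" + sample["test"]
    namespace = {}  # fresh globals so runs do not leak state into each other
    try:
        # Unsandboxed execution: acceptable for a toy example,
        # not for untrusted model output.
        exec(program, namespace)
    except Exception:
        return False
    return True


good = "    return a + b\n"  # a correct function body
bad = "    return a - b\n"   # a wrong function body that fails the tests

print(passes_tests(sample, good))
print(passes_tests(sample, bad))
```

A real evaluation harness would typically run each candidate in an isolated process with a timeout, since generated code can loop forever or have side effects.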

Papers using HumanEval (1)
