
SWE-bench Verified

Emerging
6 papers using it
70,231 HF downloads
94 HF likes
2025 first seen

Dataset Summary

SWE-bench Verified is a subset of 500 samples from the SWE-bench test set that have been human-validated for quality. SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. See this post for more details on the human-validation process. The dataset collects 500 test I
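As a rough illustration of working with the 500 samples, the sketch below loads the test split and counts instances per source repository. It assumes the Hugging Face `datasets` library, the Hub ID `princeton-nlp/SWE-bench_Verified`, and a `repo` field on each record; none of these are stated in the summary above, and the helper names `load_verified` and `count_by_repo` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping


def load_verified():
    """Load the test split (needs `pip install datasets` and network access).

    Assumption: the Hub ID below is not given in the summary and may differ.
    """
    # Imported lazily so count_by_repo stays usable without the library.
    from datasets import load_dataset

    return load_dataset("princeton-nlp/SWE-bench_Verified", split="test")


def count_by_repo(instances: Iterable[Mapping[str, str]]) -> Counter:
    """Count instances per repository.

    Assumption: each record exposes a `repo` field such as "owner/name".
    """
    return Counter(row["repo"] for row in instances)
```

A typical call would be `count_by_repo(load_verified()).most_common(5)`, which lists the repositories contributing the most instances if the assumptions above hold.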

Papers using SWE-bench Verified (6)
