HumanEval humaneval-3 Leaderboard
Auto-discovered from papers reporting HumanEval (Success rate) · Metric: Success rate (higher is better)
| # | Model | Success rate | Paper |
|---|---|---|---|
| 1 | From Code Foundation Models to Agents and Applications: A Comprehensive Survey and Practical Guide to Code Intelligence | 95.00 | - |
| 2 | From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence | 95.00 | - |
| 3 | Benchmarking Large Language Models for ABAP Code Generation: An Empirical Study on Iterative Improvement by Compiler Feedback | 75.00 | - |
| 4 | Large Language Model Guided Self-Debugging Code Generation | 5.70 | - |