| #  | Model                         | Pass@1 | Paper |
|----|-------------------------------|--------|-------|
| 1  | Gemini-Exp-1206               | 62.40  | -     |
| 2  | DeepSeek-V3                   | 62.20  | -     |
| 3  | Llama-4-Maverick              | 61.40  | -     |
| 4  | GPT-4o-2024-05-13             | 61.10  | -     |
| 5  | Quasar-Alpha                  | 60.60  | -     |
| 6  | Gemini-2.0-Flash-Exp          | 59.90  | -     |
| 7  | DeepSeek-Coder-V2-Instruct    | 59.70  | -     |
| 8  | DeepSeek-V2-Chat (2024-06-28) | 59.40  | -     |
| 9  | Gemini-Exp-1114               | 59.30  | -     |
| 10 | GPT-4.1-Mini-2025-04-14       | 59.30  | -     |
| 11 | Claude-3.5-Haiku-20241022     | 59.00  | -     |
| 12 | GPT-4o-2024-11-20             | 58.90  | -     |
| 13 | Claude-3.5-Sonnet-20240620    | 58.60  | -     |
| 14 | GPT-4-Turbo-2024-04-09        | 58.20  | -     |
| 15 | Gemini-Exp-1121               | 58.10  | -     |
| 16 | Qwen2.5-Coder-32B-Instruct    | 58.00  | -     |
| 17 | Claude-3.5-Sonnet-20241022    | 57.50  | -     |
| 18 | Gemini-1.5-Pro-API-0514       | 57.50  | -     |
| 19 | Llama-3.3-70B-Instruct        | 57.50  | -     |
| 20 | Claude-3-Opus-20240229        | 57.40  | -     |
| 21 | GPT-4o-mini-2024-07-18        | 57.40  | -     |
| 22 | GPT-4-0613                    | 57.20  | -     |
| 23 | Athene-V2-Chat                | 56.80  | -     |
| 24 | Qwen2.5-Coder-14B-Instruct    | 56.70  | -     |
| 25 | Athene-V2-Agent               | 56.10  | -     |
| 26 | Qwen2.5-72B-Instruct          | 55.90  | -     |
| 27 | Hermes-2-Theta-Llama-3-70B    | 55.60  | -     |
| 28 | Phi-4                         | 55.40  | -     |
| 29 | Gemini-1.5-Flash-API-0514     | 55.10  | -     |
| 30 | DeepSeek-R1-Distill-Qwen-32B  | 54.90  | -     |

BigCodeBench (Complete) Leaderboard