| # | Model | Elo | Paper |
|---|-------|-----|-------|
| 1 | gpt-5.5 | 1543.99 | - |
| 2 | qwen3.5-max-preview | 1536.99 | - |
| 3 | claude-opus-4-6 | 1534.61 | - |
| 4 | claude-opus-4-6-thinking | 1533.45 | - |
| 5 | kimi-k2.6 | 1530.78 | - |
| 6 | claude-fable-5 | 1530.52 | - |
| 7 | gpt-5.5-high | 1529.98 | - |
| 8 | gemini-3-pro | 1528.90 | - |
| 9 | gemini-3.1-pro-preview | 1526.94 | - |
| 10 | qwen3.7-max-preview | 1525.85 | - |
| 11 | claude-opus-4-7-thinking | 1520.19 | - |
| 12 | glm-5 | 1518.41 | - |
| 13 | qwen3.7-plus | 1517.90 | - |
| 14 | gpt-5.4-high | 1517.32 | - |
| 15 | claude-opus-4-7 | 1517.28 | - |
| 16 | gemini-3.5-flash | 1516.06 | - |
| 17 | gemini-3-flash | 1513.75 | - |
| 18 | gpt-5.4 | 1513.46 | - |
| 19 | claude-opus-4-8 | 1512.24 | - |
| 20 | glm-5.2 (max) | 1511.93 | - |
| 21 | glm-5.1 | 1511.42 | - |
| 22 | gemini-2.5-pro | 1506.12 | - |
| 23 | ernie-5.0-preview-1022 | 1505.57 | - |
| 24 | muse-spark | 1504.98 | - |
| 25 | qwen3.5-397b-a17b | 1504.83 | - |
| 26 | mimo-v2.5-pro | 1503.97 | - |
| 27 | Claude Opus 4.7 | 1395.00 | - |
| 28 | GPT-5 | 1382.00 | - |
| 29 | Claude Sonnet 4.6 | 1378.00 | - |
| 30 | Gemini 2.5 Pro | 1370.00 | - |
| 31 | GPT-4o (2024-11-20) | 1340.00 | - |
| 32 | DeepSeek-V3 | 1330.00 | - |
| 33 | Claude Haiku 4.5 | 1310.00 | - |
| 34 | Qwen 2.5 Max | 1290.00 | - |
| 35 | Claude 3.5 Sonnet | 1280.00 | - |
| 36 | Llama 4 405B Instruct | 1280.00 | - |
| 37 | Gemini 1.5 Pro (2024-02) | 1275.00 | - |
| 38 | Llama 3 70B Instruct | 1208.00 | link |
| 39 | Mixtral 8x22B Instruct | 1148.00 | - |
| 40 | Chatbot Arena (introducing) | 0.00 | - |
Chatbot Arena (LMSYS) Leaderboard