| # | Model | Success Rate (%) | Paper |
|---|---|---|---|
| 1 | Pointer Agent w/ Opus 4.7 | 83.64 | link |
| 2 | Holo3-35B-A3B | 82.56 | link |
| 3 | Pointer Agent w/ Sonnet 4.6 | 81.45 | link |
| 4 | OpenAPA w/ gemini-3.1-pro | 78.34 | link |
| 5 | VLAA-GUI w/ Opus 4.5 | 76.26 | link |
| 6 | MiniMax M3 | 75.19 | link |
| 7 | HIPPO Agent w/ Opus 4.5 | 74.48 | link |
| 8 | Qwen 3.7 Plus | 73.30 | link |
| 9 | Kimi K2.6 | 73.06 | link |
| 10 | agent s3 w/ Opus 4.5 + GPT-5 bBoN (N=10) | 72.58 | link |
| 11 | claude-sonnet-4-6 | 72.11 | link |
| 12 | agent s3 w/ GPT-5 bBoN (N=10) | 69.90 | link |
| 13 | agent s3 w/ Opus 4.5 bBoN (N=1) | 67.46 | link |
| 14 | UiPath Screen Agent w/ Opus 4.5 | 67.14 | link |
| 15 | OS-Symphony w/ GPT-5 | 65.77 | link |
| 16 | agent s3 w/ GPT-5 bBoN (N=1) | 65.58 | link |
| 17 | GBOX Agent | 64.22 | link |
| 18 | GTA1 w/ GPT-5 | 63.41 | link |
| 19 | Kimi K2.5 | 63.30 | link |
| 20 | claude-sonnet-4-5-20250929 | 62.88 | link |
| 21 | Agentic-Lybic-Maestro | 61.93 | link |
| 22 | Seed-1.8 | 61.87 | link |
| 23 | CoACT-1 | 60.76 | link |
| 24 | aworldGUIAgent-v1 | 58.04 | link |
| 25 | EvoCUA-20260105 | 56.73 | link |
| 26 | agent s2.5 w/ o3 | 56.00 | link |
| 27 | GUI-Owl-1.5 32B | 55.44 | link |
| 28 | DeepMiner-Mano-72B | 53.91 | link |
| 29 | UiPath Screen Agent w/ GPT-5 | 53.63 | link |
| 30 | GTA1 w/ o3 | 53.10 | link |
OSWorld (Verified) osworld Leaderboard