SWE-bench Lite Leaderboard
SWE-bench Lite is a 300-issue subset of SWE-bench, selected to be cheaper and faster to evaluate while preserving the original task format. It is widely used as the entry-level board for software-engineering agents. Metric: % Resolved, the percentage of the 300 issues resolved (higher is better).
| # | Model | % Resolved | Paper |
|---|---|---|---|
| 1 | ExpeRepair-v1.0 + Claude 4 Sonnet | 60.33 | link |
| 2 | Refact.ai Agent | 60.00 | link |
| 3 | KGCompass + Claude 4 Sonnet (20250514) | 58.33 | link |
| 4 | SWE-agent + Claude 4 Sonnet | 56.67 | link |
| 5 | Isoform | 55.00 | link |
| 6 | SemAgent_Multi-v1.0 | 51.67 | link |
| 7 | Isea | 51.33 | - |
| 8 | Blackbox AI Agent | 49.00 | link |
| 9 | Codev | 49.00 | link |
| 10 | Gru(2024-12-08) | 48.67 | link |
| 11 | ExpeRepair-v1.0 | 48.33 | link |
| 12 | Globant Code Fixer Agent | 48.33 | link |
| 13 | SWE-agent + Claude 3.7 Sonnet | 48.00 | link |
| 14 | devlo | 47.33 | link |
| 15 | DARS Agent | 47.00 | link |
| 16 | KGCompass + Claude 3.5 Sonnet (20241022) | 46.00 | link |
| 17 | EntroPO + R2E + Qwen3-Coder-30B-A3B-Instruct | 45.00 | link |
| 18 | Kodu-v1 + Claude-3.5 Sonnet (20241022) | 44.67 | link |
| 19 | CodeFuse-CGM | 44.00 | link |
| 20 | CodeStory Aide + Mixed Models | 43.00 | link |
| 21 | Lingxi | 42.67 | link |
| 22 | Codart AI | 41.67 | link |
| 23 | OpenHands + CodeAct v2.1 (claude-3-5-sonnet-20241022) | 41.67 | link |
| 24 | PatchKitty-0.9 + Claude-3.5 Sonnet (20241022) | 41.33 | - |
| 25 | Composio SWE-Kit (2024-10-30) | 41.00 | link |
| 26 | OrcaLoca + Agentless-1.5 + Claude-3.5 Sonnet (20241022) | 41.00 | link |
| 27 | Agentless-1.5 + Claude-3.5 Sonnet (20241022) | 40.67 | link |
| 28 | OpenCSG Starship Agentic Coder + GPT 4 (0806) | 39.67 | link |
| 29 | Bytedance MarsCode Agent | 39.33 | link |
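Because the subset has exactly 300 issues, every score in the table corresponds to a whole number of resolved issues, and one additional resolved issue moves a score by about 0.33 points. The following is a minimal sketch of that arithmetic, not the official evaluation harness; the function names `percent_resolved` and `resolved_count` are hypothetical, and it assumes scores are rounded to two decimals as they appear in the table.

```python
TOTAL_ISSUES = 300  # size of SWE-bench Lite, as stated in the description


def percent_resolved(resolved: int, total: int = TOTAL_ISSUES) -> float:
    """Convert a count of resolved issues into a % Resolved score (2 decimals)."""
    if not 0 <= resolved <= total:
        raise ValueError("resolved must be between 0 and total")
    return round(100 * resolved / total, 2)


def resolved_count(percent: float, total: int = TOTAL_ISSUES) -> int:
    """Recover the number of resolved issues from a rounded % Resolved score."""
    return round(percent * total / 100)


if __name__ == "__main__":
    # Scores taken from the table above (rank 1, rank 2, rank 29).
    for score in (60.33, 60.00, 39.33):
        print(score, "->", resolved_count(score), "issues")
    # 60.33 -> 181 issues
    # 60.0 -> 180 issues
    # 39.33 -> 118 issues
```

Read this way, the gap between rank 1 (60.33) and rank 2 (60.00) is a single issue, so small differences near the top of the board should be interpreted with care.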