MathVista Leaderboard
Mathematical reasoning in visual contexts: diagrams, charts, geometry, and figure-based math problems. The benchmark tests whether a vision-language model can read a figure and reason quantitatively about it. Scores are overall accuracy as reported by the OpenVLM Leaderboard. Metric: Accuracy (higher is better)
| # | Model | Accuracy (%) | Paper |
|---|---|---|---|
| 1 | SenseNova-V6-5-Pro | 82.80 | link |
| 2 | BlueLM-2.6-3B | 82.30 | link |
| 3 | GPT-5-20250807 | 81.90 | link |
| 4 | Gemini-2.5-Pro | 80.90 | link |
| 5 | Kimi-VL-A3B-Thinking-2506 | 79.50 | link |
| 6 | GPT-5-mini-20250807 | 79.20 | link |
| 7 | InternVL3-78B | 79.00 | link |
| 8 | SenseNova | 78.40 | link |
| 9 | R-4B | 78.00 | link |
| 10 | MiMo-VL-7B | 77.40 | link |
| 11 | SenseNova-V6-Pro | 76.90 | link |
| 12 | CongRong-v2.0 | 76.80 | link |
| 13 | BlueLM-2.5-3B | 76.70 | link |
| 14 | InternVL2.5-78B-MPO | 76.60 | link |
| 15 | InternVL3-38B | 76.30 | link |
| 16 | Ovis2-34B | 76.10 | link |
| 17 | TeleMM | 75.70 | link |
| 18 | MUG-U-7B | 74.80 | link |
| 19 | Step-1o | 74.70 | link |
| 20 | BailingMM-Lite-1203 | 74.50 | link |
| 21 | InternVL3-14B | 74.40 | link |
| 22 | Qwen2.5-VL-72B | 74.20 | link |
| 23 | SAIL-VL-1.6-8B | 74.20 | link |
| 24 | Ovis2-16B | 73.70 | link |
| 25 | InternVL2.5-38B-MPO | 73.60 | link |
| 26 | GLM-4v-Plus-20250111 | 73.50 | link |
| 27 | SAIL-VL-1.5-8B | 73.40 | link |
| 28 | MiniCPM-o-2.6 | 73.30 | link |
| 29 | VARCO-VISION-2.0-14B | 73.20 | link |
| 30 | GPT-5-nano-20250807 | 73.10 | link |
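
The Accuracy column can be read as the percentage of benchmark problems a model answers correctly. The snippet below is a minimal sketch of that calculation, assuming exact string matching between a model's final answer and the reference answer. The function name `accuracy_percent` and the sample answers are hypothetical, and the answer-extraction and matching rules actually used by the OpenVLM Leaderboard are not described here and may differ.

```python
def accuracy_percent(predictions, references):
    """Return the share of predictions that exactly match their reference, as a percentage."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy over an empty set")
    # Assumption: a prediction counts as correct only on exact equality.
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Hypothetical answers for four problems (not real MathVista data).
preds = ["A", "12", "3.5", "C"]
refs = ["A", "12", "4.0", "C"]
print(f"{accuracy_percent(preds, refs):.2f}")  # prints 75.00 (3 of 4 correct)
```

Under this reading, a score of 82.80 means roughly 83 of every 100 problems were answered correctly.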