VBench T2V (Semantic Score) Leaderboard
VBench text-to-video Semantic Score: how faithfully a generated video matches the prompt's content (object class, multiple objects, human action, color, spatial relationship, scene, style, consistency). This is the prompt-alignment axis, complementing raw visual quality. Metric: Semantic Score (higher is better).
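As a rough illustration of how a single Semantic Score can summarize the dimensions listed above, here is a minimal sketch that takes a weighted mean of per-dimension scores. The equal default weights, the 0-100 scale, the function name `semantic_score`, and the example numbers are all assumptions made for illustration. The official VBench aggregation may normalize and weight dimensions differently, so this will not reproduce the leaderboard values.

```python
# Dimension names are taken from the leaderboard description; everything
# else in this sketch (weights, scale, sample values) is hypothetical.
SEMANTIC_DIMENSIONS = (
    "object class",
    "multiple objects",
    "human action",
    "color",
    "spatial relationship",
    "scene",
    "style",
    "consistency",
)


def semantic_score(per_dimension, weights=None):
    """Return the weighted mean of per-dimension scores (0-100 scale).

    Equal weights are an assumption, not the official VBench weighting.
    """
    missing = [d for d in SEMANTIC_DIMENSIONS if d not in per_dimension]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    if weights is None:
        weights = {d: 1.0 for d in SEMANTIC_DIMENSIONS}
    total_weight = sum(weights[d] for d in SEMANTIC_DIMENSIONS)
    weighted_sum = sum(per_dimension[d] * weights[d] for d in SEMANTIC_DIMENSIONS)
    return weighted_sum / total_weight


# Made-up per-dimension results for a hypothetical model.
example_scores = {
    "object class": 90.0,
    "multiple objects": 70.0,
    "human action": 95.0,
    "color": 85.0,
    "spatial relationship": 65.0,
    "scene": 50.0,
    "style": 25.0,
    "consistency": 28.0,
}

print(round(semantic_score(example_scores), 2))
```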
| # | Model | Semantic Score | Paper |
|---|---|---|---|
| 1 | IPOW | 90.01 | link |
| 2 | Vidu Q1 (2025-04-17) | 87.94 | link |
| 3 | IPOC (2025-04-14) | 84.84 | link |
| 4 | Wan2.1 (2025-02-24) | 84.44 | link |
| 5 | JT3.5 | 84.19 | link |
| 6 | Luma | 84.17 | link |
| 7 | IPOC | 84.09 | link |
| 8 | LanDiff | 82.72 | link |
| 9 | Veo 3 | 82.49 | link |
| 10 | Wan2.1 | 80.95 | link |
| 11 | Open-Sora-2.0 | 80.14 | link |
| 12 | Open-Sora-2.0 (2025-03-18) | 80.12 | link |
| 13 | Wan2.1-T2V-1.3B | 80.09 | link |
| 14 | CogVideoX1.5-5B (5s SAT prompt-optimized) | 79.76 | link |
| 15 | Wan2.2-T2V-A14B (Qwen prompt-optimized) | 79.50 | link |
| 16 | MiracleVision V5 | 79.43 | link |
| 17 | Sora | 79.35 | link |
| 18 | CogVideoX1.5-5B | 79.17 | link |
| 19 | RepVideo | 78.91 | link |
| 20 | CausVid (2025-01-02 5s) | 78.75 | link |
| 21 | CausVid | 78.57 | link |
| 22 | AccVideo | 78.06 | link |
| 23 | CogVideoX-2B (Diffusers) | 77.81 | link |
| 24 | Vchitect-2.0-2B | 77.79 | link |
| 25 | MiniMax-Video-01 | 77.65 | link |
| 26 | STIV (Apple) | 77.57 | link |
| 27 | CogVideoX-5B (Diffusers) | 77.33 | link |
| 28 | Vchitect-2.0 (VEnhancer) | 77.06 | link |
| 29 | CogVideoX-5B (SAT prompt-optimized) | 77.04 | link |
| 30 | EasyAnimateV5.1 | 77.01 | link |