MVPBench

Emerging
10 papers using it
2025 first seen

MVPBench is a curated benchmark designed to evaluate visual physical reasoning in multimodal large language models through interleaved multi-image inputs that require coherent, step-by-step reasoning paths.

Papers using MVPBench (10)
