MVPBench
Emerging | 10 papers using it
First seen: 2025
MVPBench is a curated benchmark designed to evaluate visual physical reasoning in multimodal large language models through interleaved multi-image inputs that require coherent, step-by-step reasoning paths.
Papers using MVPBench (10)
- Pushupbench: Your VLM Is Not Good At Counting Pushups
- LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization
- Clue Matters: Leveraging Latent Visual Clues to Empower Video Reasoning
- MACD: Model-Aware Contrastive Decoding via Counterfactual Data
- Improving Video Question Answering through query-based frame selection
- Video Evidence to Reasoning Efficient Video Understanding via Explicit Evidence Grounding
- Seeing Is Not Reasoning: Mvpbench For Graph-based Evaluation Of Multi-path Visual Physical Cot
- Enhancing Temporal Understanding In Video-llms Through Stacked Temporal Attention In Vision Encoders
- Ro-bench: Large-scale Robustness Evaluation Of Mllms With Text-driven Counterfactual Videos
- Gam-agent: Game-theoretic And Uncertainty-aware Collaboration For Complex Visual Reasoning