MRMR: A Realistic And Expert-level Multidisciplinary Benchmark For Reasoning-intensive Multimodal Retrieval
2025 · Siyue Zhang, Yuan Gao, Xiao Zhou, et al.
Abstract
We introduce MRMR, the first expert-level multidisciplinary multimodal retrieval benchmark requiring intensive reasoning. MRMR contains 1,502 queries spanning 23 domains, with positive documents carefully verified by human experts. Compared to prior benchmarks, MRMR introduces three key advancements. First, it challenges retrieval systems across diverse areas of expertise, enabling fine-grained model comparison across domains. Second, queries are reasoning-intensive, with images requiring deeper interpretation such as diagnosing microscopic slides. We further introduce Contradiction Retrieval, a novel task requiring models to identify conflicting concepts. Finally, queries and documents are constructed as image-text interleaved sequences. Unlike earlier benchmarks restricted to single images or unimodal documents, MRMR offers a realistic setting with multi-image queries and mixed-modality corpus documents. We conduct an extensive evaluation of 4 categories of multimodal retrieval syste
Related papers
- MR²-Bench: Going Beyond Matching To Reasoning In Multimodal Retrieval (2025)
- MM-BRIGHT: A Multi-task Multimodal Benchmark For Reasoning-intensive Retrieval (2026)
- MARVEL: Multimodal Adaptive Reasoning-intensive Expand-rerank And Retrieval (2026)
- Beyond Global Similarity: Towards Fine-grained, Multi-condition Multimodal Retrieval (2026)
- R2MED: A Benchmark For Reasoning-driven Medical Retrieval (2025)
- Multihaystack: Benchmarking Multimodal Retrieval And Reasoning Over 40K Images, Videos, And Documents (2026)
- REAL-MM-RAG: A Real-world Multi-modal Retrieval Benchmark (2025)
- M3retrieve: Benchmarking Multimodal Retrieval For Medicine (2025)