R2MED: A Benchmark For Reasoning-driven Medical Retrieval
2025 · Xiangxu Zhang, Lei Li, Xiao Zhou, et al.
Abstract
Current medical retrieval benchmarks primarily emphasize lexical or shallow semantic similarity, overlooking the reasoning-intensive demands that are central to clinical decision-making. In practice, physicians often retrieve authoritative medical evidence to support diagnostic hypotheses. Such evidence typically aligns with an inferred diagnosis rather than the surface form of a patient's symptoms, leading to low lexical or semantic overlap between queries and relevant documents. To address this gap, we introduce R2MED, the first benchmark explicitly designed for reasoning-driven medical retrieval. It comprises 876 queries spanning three tasks: Q&A reference retrieval, clinical evidence retrieval, and clinical case retrieval. These tasks are drawn from five representative medical scenarios and twelve body systems, capturing the complexity and diversity of real-world medical information needs. We evaluate 15 widely used retrieval systems on R2MED and find that even the best model achie
Related papers
- MRMR: A Realistic And Expert-level Multidisciplinary Benchmark For Reasoning-intensive Multimodal Retrieval (2025) · 0.00
- M3Retrieve: Benchmarking Multimodal Retrieval For Medicine (2025) · 2.16
- MR\(^2\)-Bench: Going Beyond Matching To Reasoning In Multimodal Retrieval (2025) · 1.81
- RAR-b: Reasoning As Retrieval Benchmark (2024) · 2.68
- MM-BRIGHT: A Multi-task Multimodal Benchmark For Reasoning-intensive Retrieval (2026) · 2.60
- A Systematic Study Of Retrieval Pipeline Design For Retrieval-augmented Medical Question Answering (2026) · 0.00
- PMC-Patients: A Large-scale Dataset Of Patient Summaries And Relations For Benchmarking Retrieval-based Clinical Decision Support Systems (2022) · 11.39
- RadIR: A Scalable Framework For Multi-grained Medical Image Retrieval Via Radiology Report Mining (2025) · 0.00