Grounded Multimodal Retrieval-augmented Drafting Of Radiology Impressions Using Case-based Similarity Search
2026 · Himadri Samanta
Abstract
Automated radiology report generation has gained increasing attention with the rise of deep learning and large language models. However, fully generative approaches often suffer from hallucinations and lack clinical grounding, limiting their reliability in real-world workflows. In this study, we propose a multimodal retrieval-augmented generation (RAG) system for grounded drafting of chest radiograph impressions. The system combines contrastive image-text embeddings, case-based similarity retrieval, and citation-constrained draft generation to ensure factual alignment with historical radiology reports. A curated subset of the MIMIC-CXR dataset was used to construct a multimodal retrieval database. Image embeddings were generated using CLIP encoders, while textual embeddings were derived from structured impression sections. A fusion similarity framework was implemented using FAISS indexing for scalable nearest-neighbor retrieval. Retrieved cases were used to construct grounded prompts f
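The fusion similarity step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the weighted-sum fusion rule, the `alpha` parameter, the function names, and the toy two-dimensional vectors are all assumptions made for illustration. A brute-force scan stands in for the FAISS index, and the vectors stand in for CLIP image embeddings and impression-text embeddings. How the query-side text vector is obtained is not specified in the text above, so it is simply passed in here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def fused_top_k(query_img, query_txt, cases, k=2, alpha=0.5):
    """Rank stored cases by a weighted sum of image and text similarity.

    alpha is a hypothetical fusion weight: 1.0 uses image similarity only,
    0.0 uses text similarity only. A real system would likely replace this
    linear scan with an approximate or exact nearest-neighbor index.
    """
    scored = []
    for case in cases:
        score = (alpha * cosine(query_img, case["img"])
                 + (1.0 - alpha) * cosine(query_txt, case["txt"]))
        scored.append((score, case["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(case_id, score) for score, case_id in scored[:k]]


# Toy case database; identifiers and embeddings are made up for illustration.
cases = [
    {"id": "case-A", "img": [1.0, 0.0], "txt": [1.0, 0.0]},
    {"id": "case-B", "img": [0.0, 1.0], "txt": [0.0, 1.0]},
    {"id": "case-C", "img": [1.0, 1.0], "txt": [1.0, 0.0]},
]

results = fused_top_k([1.0, 0.0], [1.0, 0.0], cases, k=2)
for case_id, score in results:
    print(f"{case_id}: {score:.4f}")
```

The retrieved case identifiers could then be carried into the prompt as citations, so that each statement in a drafted impression can be traced back to a specific historical report.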
Related papers
- Beyond The Embedding Bottleneck: Adaptive Retrieval-augmented 3D CT Report Generation (2026) 0.00
- Radir: A Scalable Framework For Multi-grained Medical Image Retrieval Via Radiology Report Mining (2025) 0.00
- Multimodal Image-text Matching Improves Retrieval-based Chest X-ray Report Generation (2023) 3.33
- Unsupervised Multimodal Representation Learning Across Medical Images And Reports (2018) 0.00
- Learning Visual-semantic Embeddings For Reporting Abnormal Findings On Chest X-rays (2020) 9.76
- Ontology-based Concept Distillation For Radiology Report Retrieval And Labeling (2025) 1.20
- Prototype-enhanced Confidence Modeling For Cross-modal Medical Image-report Retrieval (2025) 0.00
- X-TRA: Improving Chest X-ray Tasks With Cross-modal Retrieval Augmentation (2023) 8.09