HOMIE: Histopathology Omni-modal Embedding For Pathology Composed Retrieval
2025 · Qifeng Zhou, Wenliang Zhong, Thao M. Dang, et al.
Abstract
The integration of Artificial Intelligence (AI) into pathology faces a fundamental challenge: black-box predictive models lack transparency, while generative approaches risk clinical hallucination. A case-based retrieval paradigm offers a more interpretable alternative for clinical adoption. However, current SOTA models are constrained by dual-encoder architectures that cannot process the composed modality of real-world clinical queries. We formally define the task of Pathology Composed Retrieval (PCR). Progress on this newly defined task is blocked by two critical challenges: (1) Multimodal Large Language Models (MLLMs) offer the necessary deep-fusion architecture but suffer from both a Task Mismatch and a Domain Mismatch; (2) no benchmark exists to evaluate such compositional queries. To solve these challenges, we propose HOMIE, a systematic framework that transforms a general MLLM into a specialized retrieval expert. HOMIE resolves the dual mismatch via a two-stage proces
Related papers
- Multimodal Learning For Scalable Representation Of High-dimensional Medical Data (2024)
- Accurate And Scalable Multimodal Pathology Retrieval Via Attentive Vision-language Alignment (2025)
- Generative Vector Search To Improve Pathology Foundation Models Across Multimodal Vision-language Tasks (2025)
- Multimodal Whole Slide Foundation Model For Pathology (2024)
- HMAR: Hierarchical Modality-aware Expert And Dynamic Routing Medical Image Retrieval Architecture (2026)
- PathAlign: A Vision-language Model For Whole Slide Images In Histopathology (2024)
- On The Importance Of Text Preprocessing For Multimodal Representation Learning And Pathology Report Generation (2025)
- CREM: Compression-driven Representation Enhancement For Multimodal Retrieval And Comprehension (2026)