Accurate And Scalable Multimodal Pathology Retrieval Via Attentive Vision-language Alignment
2025 · Hongyi Wang, Zhengjie Zhu, Jiabo Ma, et al.
Abstract
The rapid digitization of histopathology slides has opened up new possibilities for computational tools in clinical and research workflows. Among these, content-based slide retrieval stands out, enabling pathologists to identify morphologically and semantically similar cases, thereby supporting precise diagnoses, enhancing consistency across observers, and assisting example-based education. However, effective retrieval of whole slide images (WSIs) remains challenging due to their gigapixel scale and the difficulty of capturing subtle semantic differences amid abundant irrelevant content. To overcome these challenges, we present PathSearch, a retrieval framework that unifies fine-grained attentive mosaic representations with global-wise slide embeddings aligned through vision-language contrastive learning. Trained on a corpus of 6,926 slide-report pairs, PathSearch captures both fine-grained morphological cues and high-level semantic patterns to enable accurate and flexible retrieval. T
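As a rough illustration of the retrieval step the abstract describes, the sketch below ranks stored slide embeddings against a query embedding. It is not PathSearch's implementation: the use of cosine similarity, the function names `cosine` and `retrieve`, and the toy 3-dimensional vectors are all assumptions made for illustration, and the abstract does not specify the similarity measure or the embedding size.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def retrieve(query, index, k=2):
    """Return the k (slide_id, score) pairs most similar to the query."""
    scored = [(slide_id, cosine(query, emb)) for slide_id, emb in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: slide IDs mapped to made-up embeddings.
index = {
    "slide_a": [0.9, 0.1, 0.0],
    "slide_b": [0.0, 1.0, 0.0],
    "slide_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # could stand in for an image or a text embedding
top = retrieve(query, index, k=2)
for slide_id, score in top:
    print(slide_id, round(score, 3))
```

If image and text embeddings share one space, as vision-language contrastive alignment is meant to provide, the same ranking function could in principle serve both slide-to-slide and text-to-slide queries. A real system over a large archive would likely replace the linear scan with an approximate nearest-neighbor index.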
Related papers
- Pathalign: A Vision-language Model For Whole Slide Images In Histopathology (2024)
- Multimodal Whole Slide Foundation Model For Pathology (2024)
- On The Importance Of Text Preprocessing For Multimodal Representation Learning And Pathology Report Generation (2025)
- Lifelong Histopathology Whole Slide Image Retrieval Via Distance Consistency Rehearsal (2024)
- HOMIE: Histopathology Omni-modal Embedding For Pathology Composed Retrieval (2025)
- Self-supervised Similarity Learning For Digital Pathology (2019)
- Zero-shot Whole Slide Image Retrieval In Histopathology Using Embeddings Of Foundation Models (2024)
- Yottixel -- An Image Search Engine For Large Archives Of Histopathology Whole Slide Images (2019)