Multimodal Whole Slide Foundation Model For Pathology
2024 · Tong Ding, Sophia J. Wagner, Andrew H. Song, et al.
Abstract
The field of computational pathology has been transformed by recent advances in foundation models that encode histopathology regions of interest (ROIs) into versatile and transferable feature representations via self-supervised learning (SSL). However, translating these advances to complex clinical challenges at the patient and slide level remains constrained by limited clinical data in disease-specific cohorts, especially for rare clinical conditions. We propose TITAN, a multimodal whole slide foundation model pretrained on 335,645 whole slide images (WSIs) via visual self-supervised learning and vision-language alignment with corresponding pathology reports and 423,122 synthetic captions generated by a multimodal generative AI copilot for pathology. Without any finetuning or clinical labels, TITAN can extract general-purpose slide representations and generate pathology reports that generalize to resource-limited clinical scenarios such as rare disease retrieval and cancer pro
Related papers
- Pathalign: A Vision-language Model For Whole Slide Images In Histopathology (2024)
- Accurate And Scalable Multimodal Pathology Retrieval Via Attentive Vision-language Alignment (2025)
- On The Importance Of Text Preprocessing For Multimodal Representation Learning And Pathology Report Generation (2025)
- Zero-shot Whole Slide Image Retrieval In Histopathology Using Embeddings Of Foundation Models (2024)
- Self-supervised Similarity Learning For Digital Pathology (2019)
- HOMIE: Histopathology Omni-modal Embedding For Pathology Composed Retrieval (2025)
- Multimodal Learning For Scalable Representation Of High-dimensional Medical Data (2024)
- Yottixel -- An Image Search Engine For Large Archives Of Histopathology Whole Slide Images (2019)