Towards Cross-modal Text-molecule Retrieval With Better Modality Alignment
2024 · Jia Song, Wanru Zhuang, Yujie Lin, et al.
Abstract
Cross-modal text-molecule retrieval models aim to learn a shared feature space for the text and molecule modalities so that similarity can be computed accurately, which facilitates the rapid screening of molecules with specific properties and activities in drug design. However, previous work has two main shortcomings. First, it captures modality-shared features inadequately, given the significant gap between text sequences and molecule graphs. Second, it relies mainly on contrastive learning and adversarial training for cross-modality alignment, both of which focus on first-order similarity and ignore second-order similarity, which can capture more structural information in the embedding space. To address these issues, we propose a novel cross-modal text-molecule retrieval model with two improvements. Specifically, on top of two modality-specific encoders, we stack a memory-bank-based feature projector that contains learnable memory vectors to extract modality-shared
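To make the distinction between first-order and second-order similarity concrete, the following is a minimal NumPy sketch. It is not the paper's method: the abstract does not specify how second-order similarity is computed, so the sketch assumes one generic reading. Under that assumption, first-order similarity is the cosine similarity between a text embedding and a molecule embedding, and second-order similarity compares how each modality arranges its own samples, that is, the two intra-modal similarity matrices. All function names are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def first_order_similarity(text_emb, mol_emb):
    """Cosine similarity between every text and every molecule.

    Returns an array of shape (n_text, n_mol).
    """
    return l2_normalize(text_emb) @ l2_normalize(mol_emb).T


def second_order_gap(text_emb, mol_emb):
    """Assumed second-order measure (not taken from the paper).

    Compares the text-text and molecule-molecule similarity matrices of a
    paired batch; 0 means both modalities arrange their samples identically.
    """
    t = l2_normalize(text_emb)
    m = l2_normalize(mol_emb)
    return float(np.mean((t @ t.T - m @ m.T) ** 2))


def retrieve(text_emb, mol_emb):
    """For each text, return the index of the most similar molecule."""
    return np.argmax(first_order_similarity(text_emb, mol_emb), axis=1)
```

Under these assumptions the two measures are complementary. A batch whose molecule embeddings are a rotated copy of the text embeddings has a second-order gap of zero but can still retrieve the wrong molecules, so a measure like this can only supplement a first-order objective, not replace it.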