Cross-modal Retrieval: A Systematic Review Of Methods And Future Directions
2023 · Tianshi Wang, Fengling Li, Lei Zhu, et al.
Abstract
With the exponential surge in diverse multi-modal data, traditional uni-modal retrieval methods struggle to meet the needs of users seeking access to data across various modalities. To address this, cross-modal retrieval has emerged, enabling interaction across modalities, facilitating semantic matching, and leveraging complementarity and consistency between heterogeneous data. Although prior literature has reviewed the field of cross-modal retrieval, it suffers from numerous deficiencies in terms of timeliness, taxonomy, and comprehensiveness. This paper conducts a comprehensive review of cross-modal retrieval's evolution, spanning from shallow statistical analysis techniques to vision-language pre-training models. Commencing with a comprehensive taxonomy grounded in machine learning paradigms, mechanisms, and models, the paper delves deeply into the principles and architectures underpinning existing cross-modal retrieval methods. Furthermore, it offers an overview of widely-used benc
Related papers
- Composed Multi-modal Retrieval: A Survey Of Approaches And Applications (2025) 3.88
- Cross-modal Coordination Across A Diverse Set Of Input Modalities (2024) 0.00
- Evaluating Perspectival Biases In Cross-modal Retrieval (2025) 0.00
- Revisiting Cross Modal Retrieval (2018) 0.00
- Continual Learning In Cross-modal Retrieval (2021) 9.41
- Deep Learning Techniques For Future Intelligent Cross-media Retrieval (2020) 0.00
- A Comprehensive Empirical Study Of Vision-language Pre-trained Model For Supervised Cross-modal Retrieval (2022) 0.00
- MUST: An Effective And Scalable Framework For Multimodal Search Of Target Modality (2023) 7.81