Do Cross Modal Systems Leverage Semantic Relationships?
2019 · Shah Nawaz, Muhammad Kamran Janjua, Ignazio Gallo, et al.
Abstract
Current cross-modal retrieval systems are evaluated using the R@K measure, which does not leverage semantic relationships but instead strictly follows the manually marked image-text query pairs. Therefore, current systems do not generalize well to unseen data in the wild. To handle this, we propose a new measure, SemanticMap, to evaluate the performance of cross-modal systems. Our proposed measure evaluates the semantic similarity between the image and text representations in the latent embedding space. We also propose a novel cross-modal retrieval system using a single-stream network for bidirectional retrieval. The proposed system is based on a deep neural network trained using an extended center loss, which minimizes the distance of image and text descriptions in the latent space from the class centers. In our system, the text descriptions are also encoded as images, which enables us to use a single-stream network for both text and images. To the best of our knowledge, our work is the first of it
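The contrast the abstract draws between strict pair matching and semantic matching can be illustrated with a small sketch. The code below computes standard R@K, where a query counts as a hit only if its annotated partner is retrieved, alongside a hypothetical class-aware hit rate, where any retrieved item with the query's class label counts. The class-aware function is an illustrative assumption and is not the paper's SemanticMap definition, which the abstract does not spell out. The toy embeddings, labels, and all function names are likewise made up for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_gallery(query, gallery):
    """Return gallery indices sorted by decreasing similarity to the query."""
    sims = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: -sims[i])


def recall_at_k(queries, gallery, k):
    """Strict R@K: query i is a hit only if its annotated pair
    (gallery item i) appears among the top k results."""
    hits = sum(
        1 for i, q in enumerate(queries) if i in rank_gallery(q, gallery)[:k]
    )
    return hits / len(queries)


def class_hit_at_k(queries, query_labels, gallery, gallery_labels, k):
    """Hypothetical class-aware variant (an assumption, not SemanticMap):
    a query is a hit if any top-k item shares its class label."""
    hits = 0
    for q, label in zip(queries, query_labels):
        top = rank_gallery(q, gallery)[:k]
        if any(gallery_labels[i] == label for i in top):
            hits += 1
    return hits / len(queries)


# Toy 2-D embeddings: text queries and image gallery, paired by index.
queries = [[1.0, 0.0], [0.1, 1.0], [0.9, 0.2]]
query_labels = ["dog", "cat", "dog"]
gallery = [[0.6, 0.8], [0.0, 1.0], [1.0, 0.1]]
gallery_labels = ["dog", "cat", "dog"]

strict_r1 = recall_at_k(queries, gallery, 1)
class_r1 = class_hit_at_k(queries, query_labels, gallery, gallery_labels, 1)
print(f"strict R@1 = {strict_r1:.2f}, class-aware hit@1 = {class_r1:.2f}")
```

In this toy data the first query retrieves a different "dog" image than its annotated partner. Strict R@1 scores that as a miss, while the class-aware variant scores it as a hit, which is the kind of gap the abstract attributes to R@K.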
Related papers
- Preserving Semantic Neighborhoods For Robust Cross-modal Retrieval (2020)
- Revisiting Cross Modal Retrieval (2018)
- Is Cross-modal Information Retrieval Possible Without Training? (2023)
- Do Neural Network Cross-modal Mappings Really Bridge Modalities? (2018)
- Adversarial Cross-modal Retrieval Via Learning And Transferring Single-modal Similarities (2019)
- Cross-modal Semantic Enhanced Interaction For Image-sentence Retrieval (2022)
- Semcore: A Semantic-enhanced Generative Cross-modal Retrieval Framework With Mllms (2025)
- Discriminative Semantic Transitive Consistency For Cross-modal Learning (2021)