SAC: Semantic Attention Composition For Text-conditioned Image Retrieval
2020 · Surgan Jandial, Pinkesh Badjatiya, Pranit Chawla, et al.
Abstract
The ability to efficiently search for images is essential for improving the user experience across various products. Incorporating user feedback, via multi-modal inputs, to navigate visual search can help tailor retrieved results to specific user queries. We focus on the task of text-conditioned image retrieval, which uses supporting text feedback alongside a reference image to retrieve images that satisfy the constraints imposed by both inputs at once. The task is challenging because it requires learning composite image-text features by incorporating multiple cross-granular semantic edits from the text feedback and then applying them to the visual features. To address this, we propose a novel framework, SAC, which resolves the above in two major steps: "where to see" (Semantic Feature Attention) and "how to change" (Semantic Feature Modification). We systematically show how our architecture streamlines the generation of text-aware image features by removing the need for various modules requi
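The two steps named in the abstract, "where to see" and "how to change", can be illustrated with a toy sketch. This is not the SAC architecture, whose details are not given here. It is a minimal stand-in under stated assumptions: the image is a short list of region vectors, the text feedback is a single vector of the same dimension, attention is a softmax over region-text dot products, and modification is an attention-weighted additive shift. All function names and the three-dimensional toy features are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def attend(regions, text_vec):
    """'Where to see' (assumed form): one weight per region from its match with the text."""
    return softmax([dot(r, text_vec) for r in regions])

def modify(regions, text_vec, weights):
    """'How to change' (assumed form): shift each region by the text vector, scaled by its weight."""
    return [[x + w * t for x, t in zip(r, text_vec)]
            for r, w in zip(regions, weights)]

def pool(regions):
    """Average region vectors into one composite query vector."""
    n = len(regions)
    return [sum(col) / n for col in zip(*regions)]

def compose(regions, text_vec):
    weights = attend(regions, text_vec)
    return pool(modify(regions, text_vec, weights))

def retrieve(query_vec, gallery):
    """Rank gallery item names by cosine similarity to the composite query."""
    return sorted(gallery, key=lambda name: cosine(query_vec, gallery[name]),
                  reverse=True)

# Toy feature axes (hypothetical): [red, blue, dress-shape].
regions = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # reference image: a red dress
text_vec = [-1.0, 1.0, 0.0]                    # feedback: "make it blue"
gallery = {
    "red_dress": [0.5, 0.0, 0.5],
    "blue_dress": [0.0, 0.5, 0.5],
}
ranking = retrieve(compose(regions, text_vec), gallery)
print(ranking)
```

In this toy setup the composite query keeps the dress-shape component from the reference image while the colour component moves from red to blue, so the blue item ranks first. A learned model would replace the hand-written vectors and the fixed additive shift with trained encoders and a trained composition module.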
Related papers
- Image Search With Text Feedback By Additive Attention Compositional Learning (2022) · 0.00
- Beyond Semantic Search: Towards Referential Anchoring In Composed Image Retrieval (2026) · 0.00
- Modality-agnostic Attention Fusion For Visual Search With Text Feedback (2020) · 0.00
- Cross-modal Semantic Enhanced Interaction For Image-sentence Retrieval (2022) · 12.33
- BOSS: Bottom-up Cross-modal Semantic Composition With Hybrid Counterfactual Training For Robust Content-based Image Retrieval (2022) · 0.00
- Image-text Retrieval Via Preserving Main Semantics Of Vision (2023) · 10.22
- Composed Image Retrieval For Remote Sensing (2024) · 11.03
- Cala: Complementary Association Learning For Augmenting Composed Image Retrieval (2024) · 9.41