ARTEMIS: Attention-based Retrieval With Text-explicit Matching And Implicit Similarity
2022 · Ginger Delmas, Rafael Sampaio de Rezende, Gabriela Csurka, et al.
Abstract
An intuitive way to search for images is to use queries composed of an example image and a complementary text. While the former provides rich and implicit context for the search, the latter explicitly calls for new traits, or specifies how some elements of the example image should be changed to retrieve the desired target image. Current approaches typically combine the features of each of the two elements of the query into a single representation, which can then be compared to those of the potential target images. Our work aims at shedding new light on the task by looking at it through the prism of two familiar and related frameworks: text-to-image and image-to-image retrieval. Taking inspiration from them, we exploit the specific relation of each query element with the targeted image and derive lightweight attention mechanisms that mediate between the two complementary modalities. We validate our approach on several retrieval benchmarks, querying with images and their as
Related papers
- IMRAM: Iterative Matching With Recurrent Attention Memory For Cross-modal Image-text Retrieval (2020)
- Composing Text And Image For Image Retrieval - An Empirical Odyssey (2018)
- Embedding Arithmetic Of Multimodal Queries For Image Retrieval (2021)
- Efficient Image-text Retrieval Via Keyword-guided Pre-screening (2023)
- AMES: Asymmetric And Memory-efficient Similarity Estimation For Instance-level Retrieval (2024)
- SAC: Semantic Attention Composition For Text-conditioned Image Retrieval (2020)
- Modality-agnostic Attention Fusion For Visual Search With Text Feedback (2020)
- Image Search With Text Feedback By Additive Attention Compositional Learning (2022)