AMC: Attention Guided Multi-modal Correlation Learning For Image Search
2017 · Kan Chen, Trung Bui, Fang Chen, et al.
Abstract
Given a user's query, traditional image search systems rank images according to their relevance to a single modality (e.g., image content or surrounding text). Nowadays, an increasing number of images on the Internet come with associated metadata in rich modalities (e.g., titles, keywords, tags), which can be exploited to better measure similarity with queries. In this paper, we leverage visual and textual modalities for image search by learning their correlation with the input query. An attention mechanism can be introduced to adaptively balance the importance of different modalities according to the query's intent. We propose a novel Attention guided Multi-modal Correlation (AMC) learning method, which consists of a jointly learned hierarchy of intra- and inter-attention networks. Conditioned on the query's intent, intra-attention networks (i.e., a visual intra-attention network and a language intra-attention network) attend to informative parts within each modality; a multi-modal in
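The two-level attention idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the dot-product scoring, the plain softmax pooling, and the names `attend` and `amc_like_score` are all assumptions made for illustration, whereas the paper describes jointly learned attention networks. The sketch only shows the general shape: a query first weights the parts inside each modality (intra-attention), then weights the modality summaries against each other (inter-attention).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, items):
    """Query-conditioned attention (assumed dot-product scoring).

    Returns the attention-weighted sum of `items` and the weights used.
    """
    weights = softmax([dot(query, item) for item in items])
    dim = len(items[0])
    pooled = [sum(w * item[d] for w, item in zip(weights, items))
              for d in range(dim)]
    return pooled, weights


def amc_like_score(query, image_regions, keywords):
    """Hypothetical relevance score for one image and its keywords."""
    # Intra-attention: emphasise the regions / keywords that match the query.
    visual, _ = attend(query, image_regions)
    textual, _ = attend(query, keywords)
    # Inter-attention: balance the two modality summaries for this query.
    fused, modality_weights = attend(query, [visual, textual])
    return dot(query, fused), modality_weights


# Toy usage: 2-d embeddings, all values invented for illustration.
query = [1.0, 0.0]
image_regions = [[1.0, 0.0], [0.0, 1.0]]
keywords = [[0.0, 1.0], [0.0, 1.0]]
score, modality_weights = amc_like_score(query, image_regions, keywords)
```

In this toy input the image regions align with the query better than the keywords do, so the visual summary receives the larger modality weight. A learned model would replace the fixed dot-product scores with trained parameters.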
Related papers
- Mcot-mvs: Multi-level Vision Selection By Multi-modal Chain-of-thought Reasoning For Composed Image Retrieval (2026) 0.00
- MVAM: Multi-view Attention Method For Fine-grained Image-text Matching (2024) 0.00
- Bringing Multimodality To Amazon Visual Search System (2024) 6.34
- Exploring A Fine-grained Multiscale Method For Cross-modal Remote Sensing Image Retrieval (2022) 16.73
- Image Search With Text Feedback By Additive Attention Compositional Learning (2022) 0.00
- CSMCIR: Cot-enhanced Symmetric Alignment With Memory Bank For Composed Image Retrieval (2026) 0.00
- Multimodal Learned Sparse Retrieval For Image Suggestion (2024) 0.00
- Cross-modal Semantic Enhanced Interaction For Image-sentence Retrieval (2022) 12.33