Contextual Similarity Aggregation With Self-attention For Visual Re-ranking
2021 · Jianbo Ouyang, Hui Wu, Min Wang, et al.
Abstract
In content-based image retrieval, the first-round retrieval result obtained by simple visual feature comparison may be unsatisfactory and can be refined by visual re-ranking techniques. In image retrieval, it is observed that the contextual similarity among the top-ranked images is an important clue for distinguishing semantic relevance. Inspired by this observation, in this paper, we propose a visual re-ranking method based on contextual similarity aggregation with self-attention. In our approach, each image in the top-K ranking list is represented as an affinity feature vector by comparing it with a set of anchor images. Then, the affinity features of the top-K images are refined by aggregating the contextual information with a transformer encoder. Finally, the affinity features are used to recalculate the similarity scores between the query and the top-K images for re-ranking of the latter. To further improve the robustness of our re-ranking model and enhance the performance of our meth
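The three steps named in the abstract (affinity features against anchor images, contextual aggregation, rescoring) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the anchors are assumed to be the first few images of the initial ranking, cosine similarity is assumed for all comparisons, and an untrained single-head self-attention layer with identity projections stands in for the trained transformer encoder.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def affinity_features(feats, anchors):
    """Describe each image by its cosine similarity to every anchor image."""
    return l2_normalize(feats) @ l2_normalize(anchors).T


def self_attention(x):
    """Untrained single-head self-attention with a residual connection.

    Assumption: identity query/key/value projections. A real transformer
    encoder would use learned projections, several heads, and feed-forward
    layers.
    """
    d = x.shape[-1]
    logits = x @ x.T / np.sqrt(d)
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return x + weights @ x


def rerank(query, topk_feats, num_anchors):
    """Re-rank top-K candidates using context-refined affinity features.

    `topk_feats` is assumed to be sorted by the first-round ranking, so its
    first `num_anchors` rows serve as the anchor images (an assumption).
    """
    anchors = topk_feats[:num_anchors]
    # The query is placed first so it is refined together with the candidates.
    seq = np.vstack([query[None, :], topk_feats])
    refined = l2_normalize(self_attention(affinity_features(seq, anchors)))
    scores = refined[1:] @ refined[0]  # cosine similarity to the refined query
    order = np.argsort(-scores)        # best candidate first
    return order, scores
```

Calling `rerank(query, topk_feats, num_anchors=4)` with a feature vector of shape `(D,)` and a candidate matrix of shape `(K, D)` returns the new candidate order together with the recomputed scores. In a trained model the attention weights would be learned, so the ordering produced by this untrained sketch only illustrates the data flow.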
Related papers
- Visual Re-ranking With Non-visual Side Information (2025) 0.00
- Moving Towards Centers: Re-ranking With Attention And Memory For Re-identification (2021) 8.09
- Graph Convolution Based Efficient Re-ranking For Visual Retrieval (2023) 9.92
- Attribute-aware Deep Hashing With Self-consistency For Large-scale Fine-grained Image Retrieval (2023) 11.76
- STIR: Siamese Transformer For Image Retrieval Postprocessing (2023) 11.23
- Learnable Pillar-based Re-ranking For Image-text Retrieval (2023) 9.92
- Contextual Visual Similarity (2016) 0.00
- Attention Grounded Enhancement For Visual Document Retrieval (2025) 0.00