Learning Regional Attention Over Multi-resolution Deep Convolutional Features For Trademark Retrieval
2021 · Osman Tursun, Simon Denman, Sridha Sridharan, et al.
Abstract
Large-scale trademark retrieval is an important content-based image retrieval task. A recent study shows that off-the-shelf deep features aggregated with Regional-Maximum Activation of Convolutions (R-MAC) achieve state-of-the-art results. However, R-MAC suffers in the presence of background clutter/trivial regions and scale variance, and discards important spatial information. We introduce three simple but effective modifications to R-MAC to overcome these drawbacks. First, we propose the use of both sum and max pooling to minimise the loss of spatial information. We also employ domain-specific unsupervised soft-attention to eliminate background clutter and unimportant regions. Finally, we add multi-resolution inputs to enhance the scale-invariance of R-MAC. We evaluate these three modifications on the million-scale METU dataset. Our results show that all modifications bring non-trivial improvements, and surpass previous state-of-the-art results.
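The three modifications named in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a non-negative feature map stored as nested lists (channels × height × width), a single sliding grid of square regions instead of R-MAC's multi-scale grid, and an attention map that is simply passed in rather than learned. All function names (`pool_region`, `describe`, `multi_resolution`) and the mean-attention region weight are hypothetical choices made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def pool_region(fmap, top, left, size):
    """Return (max-pooled, sum-pooled) channel vectors for one square region."""
    max_vec, sum_vec = [], []
    for channel in fmap:
        values = [channel[y][x]
                  for y in range(top, top + size)
                  for x in range(left, left + size)]
        max_vec.append(max(values))
        sum_vec.append(sum(values))
    return max_vec, sum_vec


def regions(height, width, size, stride):
    """Top-left corners of a sliding grid (a simplification of the R-MAC grid)."""
    return [(top, left)
            for top in range(0, height - size + 1, stride)
            for left in range(0, width - size + 1, stride)]


def describe(fmap, attention, size=2, stride=1):
    """Aggregate attention-weighted regional descriptors into one vector."""
    height, width = len(fmap[0]), len(fmap[0][0])
    total = [0.0] * (2 * len(fmap))
    for top, left in regions(height, width, size, stride):
        max_vec, sum_vec = pool_region(fmap, top, left, size)
        # Keep both pooled views: max keeps peak responses, sum keeps
        # how much of the region is active.
        regional = l2_normalize(max_vec) + l2_normalize(sum_vec)
        # Assumption: a region's weight is the mean attention inside it,
        # so background regions with low attention contribute little.
        weight = sum(attention[y][x]
                     for y in range(top, top + size)
                     for x in range(left, left + size)) / (size * size)
        total = [t + weight * r for t, r in zip(total, regional)]
    return l2_normalize(total)


def multi_resolution(inputs):
    """Sum descriptors computed from (fmap, attention) pairs, one per input scale."""
    total = None
    for fmap, attention in inputs:
        desc = describe(fmap, attention)
        total = desc if total is None else [a + b for a, b in zip(total, desc)]
    return l2_normalize(total)
```

In use, each `(fmap, attention)` pair would come from running the same network on the image at a different input resolution. The resulting descriptors have length twice the channel count and unit norm, so two images can be compared with a dot product.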
Related papers
- An Accurate Retrieval Through R-MAC+ Descriptors For Landmark Recognition (2018)
- A Large-scale Dataset And Benchmark For Similar Trademark Retrieval (2017)
- DALG: Deep Attentive Local And Global Modeling For Image Retrieval (2022)
- REMAP: Multi-layer Entropy-guided Pooling Of Dense CNN Features For Image Retrieval (2019)
- Exploring A Fine-grained Multiscale Method For Cross-modal Remote Sensing Image Retrieval (2022)
- End-to-end Learning Of Deep Visual Representations For Image Retrieval (2016)
- Attention-aware Generalized Mean Pooling For Image Retrieval (2018)
- Deep Image Retrieval: Learning Global Representations For Image Search (2016)