Retrieval Augmented Classification For Long-tail Visual Recognition
2022 · Alexander Long, Wei Yin, Thalaiyasingam Ajanthan, et al.
Abstract
We introduce Retrieval Augmented Classification (RAC), a generic approach to augmenting standard image classification pipelines with an explicit retrieval module. RAC consists of a standard base image encoder fused with a parallel retrieval branch that queries a non-parametric external memory of pre-encoded images and associated text snippets. We apply RAC to the problem of long-tail classification and demonstrate a significant improvement over previous state-of-the-art on Places365-LT and iNaturalist-2018 (14.5% and 6.7% respectively), despite using only the training datasets themselves as the external information source. We demonstrate that RAC's retrieval module, without prompting, learns a high level of accuracy on tail classes. This, in turn, frees the base encoder to focus on common classes, and improve its performance thereon. RAC represents an alternative approach to utilizing large, pretrained models without requiring fine-tuning, as well as a first step towards more effective
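The two-branch design described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how retrieved items are turned into predictions or how the branches are fused, so the similarity-weighted label vote and the additive logit fusion below are assumptions made for illustration. All function names (`retrieve`, `retrieval_logits`, `fuse`, `predict`) and the toy embeddings are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, memory, k):
    """Return the k (similarity, label) pairs closest to the query.

    `memory` stands in for the non-parametric external memory: a list of
    (pre-encoded embedding, class label) pairs built from the training set.
    """
    scored = [(cosine(query, emb), label) for emb, label in memory]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def retrieval_logits(neighbours, num_classes):
    """ASSUMPTION: a similarity-weighted vote replaces the learned retrieval branch."""
    logits = [0.0] * num_classes
    for similarity, label in neighbours:
        logits[label] += similarity
    return logits


def fuse(base_logits, retr_logits):
    """ASSUMPTION: the two branches are fused by adding their logits."""
    return [b + r for b, r in zip(base_logits, retr_logits)]


def predict(query, base_logits, memory, num_classes, k=2):
    """Fuse base-encoder logits with retrieval logits and return the argmax class."""
    neighbours = retrieve(query, memory, k)
    fused = fuse(base_logits, retrieval_logits(neighbours, num_classes))
    return max(range(num_classes), key=fused.__getitem__)


# Toy data: class 0 plays the role of a head class, class 2 a tail class.
memory = [
    ([1.0, 0.0], 0),
    ([0.9, 0.1], 0),
    ([0.0, 1.0], 2),
    ([0.1, 0.9], 2),
]
query = [0.05, 1.0]            # embedding of a tail-class image
base_logits = [0.5, 0.4, 0.0]  # the base encoder alone prefers head class 0

print(predict(query, base_logits, memory, num_classes=3))
```

In this toy case the base logits alone would select class 0, while the two nearest memory entries both carry label 2 and tip the fused prediction to the tail class, which mirrors the division of labour the abstract describes.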
Related papers
- Improving Image Recognition By Retrieving From Web-scale Image-text Data (2023) · 9.41
- Cross-modal Retrieval Augmentation For Multi-modal Classification (2021) · 9.23
- Retrieval-augmented Perception: High-resolution Image Perception Meets Visual RAG (2025) · 0.00
- Towards Retrieval-augmented Architectures For Image Captioning (2024) · 9.41
- Retrieval Augmentation For Deep Neural Networks (2021) · 5.84
- RAVEN: Multitask Retrieval Augmented Vision-language Learning (2024) · 0.00
- RAVID: Retrieval-augmented Visual Detection: A Knowledge-driven Approach For AI-generated Image Identification (2025) · 0.00
- An Accurate Retrieval Through R-MAC+ Descriptors For Landmark Recognition (2018) · 8.82