IntRec: Intent-based Retrieval with Contrastive Refinement
2026 · Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, et al.
Abstract
Retrieving user-specified objects from complex scenes remains challenging, especially when queries are ambiguous or the scene contains multiple similar objects. Existing open-vocabulary detectors operate in a one-shot manner and cannot revise their predictions once made. To address this, we propose IntRec, an interactive object retrieval framework that refines predictions based on user feedback. At its core is an Intent State (IS) that maintains two memory sets: positive anchors (confirmed cues) and negative constraints (rejected hypotheses). A contrastive alignment function ranks candidate objects by maximizing similarity to the positive cues while penalizing similarity to the rejected ones, enabling fine-grained disambiguation in cluttered scenes. The interactive framework substantially improves retrieval accuracy without additional supervision. On LVIS, IntRec achieves 35.4 AP, outperforming OVMR, CoDet, and CAKE by +2.3, +3.7, and +0.5 AP, respectively. On the challenging
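The abstract does not give the form of the contrastive alignment function, so the following is only a minimal sketch of the idea under stated assumptions: candidates and feedback are embedding vectors, similarity is cosine similarity, and a candidate's score is its highest similarity to any positive anchor minus a weighted highest similarity to any negative constraint. The names `IntentState`, `contrastive_scores`, `rank`, and the weight `lam` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field

import numpy as np


def _unit(x):
    """L2-normalise rows so that dot products equal cosine similarities.

    Assumes no row is the zero vector.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


@dataclass
class IntentState:
    """Hypothetical dual memory: confirmed cues and rejected hypotheses."""
    positives: list = field(default_factory=list)
    negatives: list = field(default_factory=list)

    def confirm(self, emb):
        self.positives.append(np.asarray(emb, dtype=float))

    def reject(self, emb):
        self.negatives.append(np.asarray(emb, dtype=float))


def contrastive_scores(candidates, state, lam=1.0):
    """Reward similarity to positives, penalise similarity to negatives.

    Max-pooling over each memory set and the weight `lam` are assumptions
    made for illustration, not the paper's formula.
    """
    cands = _unit(candidates)
    scores = np.zeros(len(cands))
    if state.positives:
        scores += (cands @ _unit(state.positives).T).max(axis=1)
    if state.negatives:
        scores -= lam * (cands @ _unit(state.negatives).T).max(axis=1)
    return scores


def rank(candidates, state, lam=1.0):
    """Return candidate indices ordered from best to worst score."""
    return np.argsort(-contrastive_scores(candidates, state, lam))


# Toy scene: candidates 0 and 1 are near-duplicates, candidate 2 is unrelated.
cands = [[1.0, 0.2], [1.0, -0.2], [0.0, 1.0]]
state = IntentState()
state.confirm([1.0, 0.0])   # initial cue matches candidates 0 and 1 equally
state.reject(cands[0])      # the user rejects candidate 0
order = rank(cands, state)  # candidate 1 now ranks first
```

In this toy scene the positive cue alone cannot separate the two near-duplicate candidates; rejecting one of them pushes its score down and leaves the other at the top of the ranking, which is the kind of disambiguation the abstract describes.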