Composed Object Retrieval: Object-level Retrieval Via Composed Expressions
2025 · Tong Wang, Guanyu Yang, Nian Liu, et al.
Abstract
Retrieving fine-grained visual content based on user intent remains a challenge in multi-modal systems. Although current Composed Image Retrieval (CIR) methods combine reference images with retrieval texts, they are constrained to image-level matching and cannot localize specific objects. To address this, we propose Composed Object Retrieval (COR), a new task that goes beyond image-level retrieval to achieve object-level precision: target objects are retrieved and segmented based on composed expressions that combine reference objects with retrieval texts. COR demands considerable retrieval flexibility, since systems must identify arbitrary objects that satisfy the composed expression while rejecting semantically similar but irrelevant negative objects within the same scene. We construct COR127K, the first large-scale COR benchmark, which contains 127,166 retrieval triplets covering various semantic transformations across 408 categories. We also present CORE, a unified
Related papers
- Beyond Semantic Search: Towards Referential Anchoring In Composed Image Retrieval (2026) 0.00
- COLA: A Benchmark For Compositional Text-to-image Retrieval (2023) 0.00
- InfoCIR: Multimedia Analysis For Composed Image Retrieval (2026) 1.24
- Instance-level Composed Image Retrieval (2025) 0.00
- HINT: Composed Image Retrieval With Dual-path Compositional Contextualized Network (2026) 0.78
- A Sanity Check On Composed Image Retrieval (2026) 0.00
- From Mapping To Composing: A Two-stage Framework For Zero-shot Composed Image Retrieval (2025) 0.00
- CSMCIR: CoT-enhanced Symmetric Alignment With Memory Bank For Composed Image Retrieval (2026) 0.00