SORCE: Small Object Retrieval In Complex Environments
2025 · Chunxu Liu, Chi Xie, Xiaxu Chen, et al.
Abstract
Text-to-Image Retrieval (T2IR) is a highly valuable task that aims to match a given textual query to images in a gallery. Existing benchmarks primarily focus on textual queries describing overall image semantics or foreground salient objects, possibly overlooking inconspicuous small objects, especially in complex environments. Such small object retrieval is crucial, as in real-world applications, the targets of interest are not always prominent in the image. Thus, we introduce SORCE (Small Object Retrieval in Complex Environments), a new subfield of T2IR, focusing on retrieving small objects in complex images with textual queries. We propose a new benchmark, SORCE-1K, consisting of images with complex environments and textual queries describing less conspicuous small objects with minimal contextual cues from other salient objects. Preliminary analysis on SORCE-1K finds that existing T2IR methods struggle to capture small objects and encode all the semantics into a single embedding, lea
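The single-embedding retrieval setup that the abstract refers to can be sketched as follows. This is only a minimal illustration, not the SORCE method: it assumes each image and each query has already been encoded into one vector, and the helper names (`cosine`, `rank_gallery`) and the toy three-dimensional vectors are hypothetical stand-ins rather than outputs of any real model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_gallery(query_emb, gallery_embs):
    """Return gallery indices ordered from most to least similar to the query."""
    scores = [cosine(query_emb, g) for g in gallery_embs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy gallery: one hand-made embedding per image (assumed, not model output).
gallery = [
    [1.0, 0.0, 0.0],  # image 0
    [0.0, 1.0, 0.0],  # image 1
    [0.7, 0.7, 0.0],  # image 2
]
query = [0.1, 0.9, 0.0]  # hand-made embedding of a textual query

ranking = rank_gallery(query, gallery)
print(ranking)  # indices of gallery images, best match first
```

Because each image is reduced to a single vector in this setup, a small object has to compete with everything else in the scene for representation in that vector, which is the limitation the abstract points to.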
Related papers
- Composed Object Retrieval: Object-level Retrieval Via Composed Expressions (2025) · 1.91
- Find Your Needle: Small Object Image Retrieval Via Multi-object Attention Optimization (2025) · 0.00
- Referring Expression Instance Retrieval And A Strong End-to-end Baseline (2025) · 0.00
- CFIR: Fast And Effective Long-text To Image Retrieval For Large Corpora (2024) · 7.16
- iSEARLE: Improving Textual Inversion For Zero-shot Composed Image Retrieval (2024) · 12.09
- Zero-shot Composed Image Retrieval With Textual Inversion (2023) · 19.84
- SHREC 2025: Retrieval Of Optimal Objects For Multi-modal Enhanced Language And Spatial Assistance (ROOMELSA) (2025) · 3.58
- Object-centric Open-vocabulary Image-retrieval With Aggregated Features (2023) · 0.00