PRISM: Product Retrieval In Shopping Carts Using Hybrid Matching
2025 · Arda Kabadayi, Senem Velipasalar, Jiajing Chen
Abstract
Compared to traditional image retrieval tasks, product retrieval in retail settings is even more challenging. Products of the same type from different brands may have highly similar visual appearances, and the query image may be taken from an angle that differs significantly from the view angles of the stored catalog images. Foundation models, such as CLIP and SigLIP, often struggle to distinguish these subtle but important local differences. Pixel-wise matching methods, on the other hand, are computationally expensive and incur prohibitively high matching times. In this paper, we propose a new hybrid method, called PRISM, for product retrieval in retail settings that leverages the advantages of both vision-language model-based and pixel-wise matching approaches. To provide both speed and fine-grained retrieval accuracy, PRISM consists of three stages: 1) A vision-language model (SigLIP) is employed first to retrieve the top 35 most semantically similar products from a fixed ga
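The first stage described above (shortlisting the 35 most similar catalog products before any finer matching) can be illustrated with a minimal sketch. It assumes that embeddings have already been computed by some image encoder and that similarity is measured by cosine similarity; neither detail is confirmed by the abstract. The function names `l2_normalize` and `top_k_candidates` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k_candidates(
    query: Sequence[float],
    gallery: Sequence[Sequence[float]],
    k: int = 35,  # shortlist size quoted in the abstract
) -> List[Tuple[int, float]]:
    """Return (gallery_index, cosine_similarity) pairs for the k best matches.

    Assumption: `query` and every entry of `gallery` are embeddings from the
    same encoder (the abstract names SigLIP; any encoder works for this sketch).
    """
    q = l2_normalize(query)
    scored = []
    for idx, emb in enumerate(gallery):
        g = l2_normalize(emb)
        scored.append((idx, sum(a * b for a, b in zip(q, g))))
    # Highest similarity first; later stages would only re-rank this shortlist.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The design point is that the expensive fine-grained comparison only needs to run on the shortlist returned here, not on the whole catalog. In practice the gallery embeddings would be normalized once and cached rather than recomputed per query.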
Related papers
- SIR: Similar Image Retrieval For Product Search In E-commerce (2020)
- Through The Prism: Importance-aware Scene Graphs For Image Retrieval (2025)
- Visually Similar Products Retrieval For Shopsy (2022)
- Visual Product Search Benchmark (2026)
- Multimodal Semantic Retrieval For Product Search (2025)
- Zero-shot Retrieval For Scalable Visual Search In A Two-sided Marketplace (2025)
- FashionMV: Product-level Composed Image Retrieval With Multi-view Fashion Data (2026)
- Hierarchical Similarity Learning For Language-based Product Image Retrieval (2021)