Delving Into E-commerce Product Retrieval With Vision-language Pre-training
2023 · Xiaoyang Zheng, Fuyu Lv, Zilong Wang, et al.
Abstract
E-commerce search engines comprise a retrieval phase and a ranking phase; the retrieval phase returns a candidate product set for a given user query. Recently, vision-language pre-training, which combines textual information with visual clues, has become popular in retrieval tasks. In this paper, we propose a novel V+L pre-training method to solve the retrieval problem in Taobao Search. We design a visual pre-training task based on contrastive learning that outperforms common regression-based visual pre-training tasks. In addition, we adopt two negative sampling schemes tailored to the large-scale retrieval task. We also describe the online deployment of the proposed method in real-world settings. Extensive offline and online experiments demonstrate the superior performance of our method on the retrieval task. The proposed method is deployed as one retrieval channel of Taobao Search and serves hundreds of millions of users in real time.
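As a rough illustration of what a contrastive objective with negative sampling looks like in a retrieval setting, the sketch below computes a generic InfoNCE loss with in-batch negatives. It is not the paper's method: the abstract does not specify the visual pre-training task or the two negative sampling schemes, so the in-batch scheme, the function name `info_nce_loss`, the helper names, and the temperature of 0.07 are all assumptions chosen for illustration.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def info_nce_loss(query_embs, item_embs, temperature=0.07):
    """Mean InfoNCE loss over a batch (hypothetical helper, not from the paper).

    item_embs[i] is treated as the positive for query_embs[i]; every other
    item in the batch serves as a negative (in-batch negative sampling).
    """
    queries = [l2_normalize(q) for q in query_embs]
    items = [l2_normalize(p) for p in item_embs]
    total = 0.0
    for i, q in enumerate(queries):
        logits = [dot(q, p) / temperature for p in items]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(queries)

# Toy batch: queries that match their own items versus mismatched pairs.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is near zero when each query is closest to its own item and large when the pairs are swapped, which is the behavior a contrastive retrieval objective relies on.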
Related papers
- MAKE: Vision-language Pre-training Based Product Retrieval In Taobao Search (2023) · 7.81
- V\(^2\)L: Leveraging Vision And Vision-language Models Into Large-scale Product Retrieval (2022) · 0.00
- Zero-shot Retrieval For Scalable Visual Search In A Two-sided Marketplace (2025) · 1.57
- Unified Vision-language Representation Modeling For E-commerce Same-style Products Retrieval (2023) · 6.34
- Embedding-based Product Retrieval In Taobao Search (2021) · 13.70
- Retrieval-GRPO: A Multi-objective Reinforcement Learning Framework For Dense Retrieval In Taobao Search (2025) · 0.00
- Visually Similar Products Retrieval For Shopsy (2022) · 2.26
- Multi-objective Personalized Product Retrieval In Taobao Search (2022) · 0.00