Product1M: Towards Weakly Supervised Instance-level Product Retrieval Via Cross-modal Pretraining
2021 · Xunlin Zhan, Yangxin Wu, Xiao Dong, et al.
Abstract
Nowadays, customers' demands for e-commerce are more diversified, which introduces more complications to the product retrieval industry. Previous methods are either restricted to single-modal input or perform supervised image-level product retrieval, and thus fail to accommodate real-life scenarios where enormous amounts of weakly annotated multi-modal data are present. In this paper, we investigate a more realistic setting that aims to perform weakly supervised multi-modal instance-level product retrieval among fine-grained product categories. To promote the study of this challenging task, we contribute Product1M, one of the largest multi-modal cosmetic datasets for real-world instance-level retrieval. Notably, Product1M contains over 1 million image-caption pairs and consists of two sample types, i.e., single-product and multi-product samples, which encompass a wide variety of cosmetics brands. In addition to the great diversity, Product1M enjoys several appealing characteristics including fine-graine
Related papers
- Entity-graph Enhanced Cross-modal Pretraining For Instance-level Product Retrieval (2022)
- Large-scale Product Retrieval With Weakly Supervised Representation Learning (2022)
- ASR-enhanced Multimodal Representation Learning For Cross-domain Product Retrieval (2024)
- Multimodal Semantic Retrieval For Product Search (2025)
- MAKE: Vision-language Pre-training Based Product Retrieval In Taobao Search (2023)
- FashionMV: Product-level Composed Image Retrieval With Multi-view Fashion Data (2026)
- ACE-BERT: Adversarial Cross-modal Enhanced BERT For E-commerce Retrieval (2021)
- CommerceMM: Large-scale Commerce Multimodal Representation Learning With Omni Retrieval (2022)