LLMs As Sparse Retrievers: A Framework For First-stage Product Search
2025 · Hongru Song, Yu-An Liu, Ruqing Zhang, et al.
Abstract
Product search is a crucial component of modern e-commerce platforms, with billions of user queries every day. In product search systems, first-stage retrieval should achieve high recall while ensuring efficient online deployment. Sparse retrieval is particularly attractive in this context due to its interpretability and storage efficiency. However, sparse retrieval methods suffer from severe vocabulary mismatch issues, leading to suboptimal performance in product search scenarios. With their potential for semantic analysis, large language models (LLMs) offer a promising avenue for mitigating vocabulary mismatch issues and thereby improving retrieval quality. Directly applying LLMs to sparse retrieval in product search exposes two key challenges: (1) Queries and product titles are typically short and highly susceptible to LLM-induced hallucinations, such as generating irrelevant expansion terms or underweighting critical literal terms like brand names and model numbers; (2) The large vocab
Related papers
- Large Reasoning Embedding Models: Towards Next-generation Dense Retrieval Paradigm (2025)
- Multimodal Semantic Retrieval For Product Search (2025)
- Embedding-based Product Retrieval In Taobao Search (2021)
- LightRetriever: A LLM-based Text Retrieval Architecture With Extremely Faster Query Inference (2025)
- CSPLADE: Learned Sparse Retrieval With Causal Language Models (2025)
- Learning Retrieval Models With Sparse Autoencoders (2026)
- A Comparative Study Of Specialized LLMs As Dense Retrievers (2025)
- V\(^2\)L: Leveraging Vision And Vision-language Models Into Large-scale Product Retrieval (2022)