Semantic Models For The First-stage Retrieval: A Comprehensive Review
2021 · Jiafeng Guo, Yinqiong Cai, Yixing Fan, et al.
Abstract
Multi-stage ranking pipelines are a practical solution in modern search systems: the first-stage retrieval returns a subset of candidate documents, and later stages re-rank those candidates. While the re-ranking stages have gone through rapid technique shifts over the past decades, the first-stage retrieval has long been dominated by classical term-based models. Unfortunately, these models suffer from the vocabulary mismatch problem, which may keep relevant documents from ever reaching the re-ranking stages. It has therefore been a long-standing goal to build semantic models for the first-stage retrieval that achieve high recall efficiently. Recently, research interest in first-stage semantic retrieval models has grown explosively. We believe it is the right time to survey the current status, learn from existing methods, and gain insights for future development. In this paper, we describe the current landscape of the first-stage retr
Related papers
- L^2R: Lifelong Learning For First-stage Retrieval With Backward-compatible Representations (2023) · 5.24
- Efficient And Effective Tail Latency Minimization In Multi-stage Retrieval Systems (2017) · 11.76
- Optimizing Compound Retrieval Systems (2025) · 0.00
- CAME: Competitively Learning A Mixture-of-experts Model For First-stage Retrieval (2023) · 6.34
- Beyond Precision: A Study On Recall Of Initial Retrieval With Neural Representations (2018) · 4.52
- Neural Ranking Models For Document Retrieval (2021) · 11.08
- A Deep Look Into Neural Ranking Models For Information Retrieval (2019) · 17.73
- Cross-modal Retrieval: A Systematic Review Of Methods And Future Directions (2023) · 12.81