Towards Competitive Search Relevance For Inference-free Learned Sparse Retrievers
2024 · Zhichao Geng, Yiwen Wang, Dongyu Ru, et al.
Abstract
Learned sparse retrieval, which can efficiently perform retrieval through mature inverted-index engines, has garnered growing attention in recent years. In particular, inference-free sparse retrievers are attractive because they eliminate online model inference in the retrieval phase, thereby avoiding heavy computational cost and offering reasonable throughput and latency. However, even state-of-the-art (SOTA) inference-free sparse models lag far behind both sparse and dense siamese models in search relevance. Towards competitive search relevance for inference-free sparse retrievers, we argue that they deserve dedicated training methods rather than reusing the ones designed for siamese encoders. In this paper, we propose two approaches for performance improvement. First, we propose an IDF-aware penalty for the matching function that suppresses the contribution of low-IDF tokens and increases the model's focus on informative terms. Moreover, we propose a heterogeneo
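To make the idea of IDF-aware matching concrete, here is a minimal Python sketch. It is not the paper's formulation: it assumes the query side is plain tokenization with no model inference, that each document has a precomputed sparse vector of token weights, and that the score is the sum of IDF times document weight over the query tokens. The names `compute_idf` and `idf_aware_score`, the smoothed IDF formula, and the toy weights are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def compute_idf(corpus_tokens):
    """Smoothed IDF per token: log(1 + N / df). The smoothing is an assumption."""
    n_docs = len(corpus_tokens)
    doc_freq = Counter(tok for doc in corpus_tokens for tok in set(doc))
    return {tok: math.log(1 + n_docs / df) for tok, df in doc_freq.items()}


def idf_aware_score(query_tokens, doc_weights, idf):
    """Score a document without running any model on the query.

    query_tokens: tokens from a plain tokenizer (no online inference).
    doc_weights:  token -> weight, assumed to be produced offline by a
                  learned document encoder (hypothetical values below).
    idf:          token -> IDF, used to down-weight uninformative tokens.
    """
    return sum(
        idf.get(tok, 0.0) * doc_weights.get(tok, 0.0)
        for tok in set(query_tokens)
    )


# Toy corpus: "the" appears everywhere, "sparse" in only one document.
corpus = [
    ["the", "sparse", "retrieval"],
    ["the", "dense", "retrieval"],
    ["the", "inverted", "index"],
]
idf = compute_idf(corpus)

# Hypothetical offline document representation.
doc_weights = {"the": 0.9, "sparse": 1.2, "index": 0.4}
score = idf_aware_score(["the", "sparse"], doc_weights, idf)
```

In this sketch the frequent token "the" contributes less to the score than "sparse" even though both carry sizeable document weights, which is the suppression effect on low-IDF tokens that the abstract describes.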
Related papers
- Exploring \(\ell_0\) Sparsification For Inference-free Sparse Retrievers (2025)
- Efficient Inverted Indexes For Approximate Retrieval Over Learned Sparse Representations (2024)
- Enhancing The Ranking Context Of Dense Retrieval Methods Through Reciprocal Nearest Neighbors (2023)
- Predicting Efficiency/effectiveness Trade-offs For Dense Vs. Sparse Retrieval Strategy Selection (2021)
- DeeperImpact: Optimizing Sparse Learned Index Structures (2024)
- Early Stage Sparse Retrieval With Entity Linking (2022)
- Operational Advice For Dense And Sparse Retrievers: HNSW, Flat, Or Inverted Indexes? (2024)
- End-to-end Retrieval With Learned Dense And Sparse Representations Using Lucene (2023)