Efficient Inverted Indexes For Approximate Retrieval Over Learned Sparse Representations
2024 · Sebastian Bruch, Franco Maria Nardini, Cosimo Rulli, et al.
Abstract
Learned sparse representations form an attractive class of contextual embeddings for text retrieval. That is so because they are effective models of relevance and are interpretable by design. Despite their apparent compatibility with inverted indexes, however, retrieval over sparse embeddings remains challenging. That is due to the distributional differences between learned embeddings and term frequency-based lexical models of relevance such as BM25. Recognizing this challenge, a great deal of research has gone into, among other things, designing retrieval algorithms tailored to the properties of learned sparse representations, including approximate retrieval systems. In fact, this task featured prominently in the latest BigANN Challenge at NeurIPS 2023, where approximate algorithms were evaluated on a large benchmark dataset by throughput and recall. In this work, we propose a novel organization of the inverted index that enables fast yet effective approximate retrieval over learned s
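For orientation, the baseline setting the abstract refers to can be sketched as follows: sparse embeddings stored in a plain inverted index and scored by inner product, one query term at a time. This is a minimal illustration only, not the index organization the paper proposes; the function names (`build_inverted_index`, `search`) and the toy vectors are assumptions made for this example.

```python
from collections import defaultdict
import heapq


def build_inverted_index(docs):
    """Map each term id to a posting list of (doc_id, weight) pairs.

    `docs` is a list of sparse vectors, each a dict {term_id: weight}.
    """
    index = defaultdict(list)
    for doc_id, vec in enumerate(docs):
        for term, weight in vec.items():
            index[term].append((doc_id, weight))
    return index


def search(index, query, k):
    """Exact top-k by inner product, traversing one posting list per query term."""
    scores = defaultdict(float)
    for term, q_weight in query.items():
        # Only documents that share a term with the query are ever touched.
        for doc_id, d_weight in index.get(term, ()):
            scores[doc_id] += q_weight * d_weight
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])


# Toy collection (hypothetical weights): three documents over a four-term vocabulary.
docs = [
    {0: 0.5, 3: 1.0},
    {1: 2.0, 3: 0.5},
    {0: 1.5, 1: 1.0},
]
index = build_inverted_index(docs)
top = search(index, {0: 1.0, 3: 2.0}, k=2)
print(top)  # [(0, 2.5), (2, 1.5)]
```

This exhaustive traversal visits every posting of every query term. Approximate retrieval methods, such as those the abstract describes being evaluated by throughput and recall, aim to skip part of that work while keeping recall high.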
Related papers
- Pairing Clustered Inverted Indexes With kNN Graphs For Fast Approximate Retrieval Over Learned Sparse Representations (2024) 7.50
- Towards Competitive Search Relevance For Inference-free Learned Sparse Retrievers (2024) 0.00
- SLIM: Sparsified Late Interaction For Multi-vector Retrieval With Inverted Indexes (2023) 7.50
- Faster Learned Sparse Retrieval With Guided Traversal (2022) 11.29
- Investigating The Scalability Of Approximate Sparse Retrieval Algorithms To Massive Datasets (2025) 5.84
- Efficient And Effective Retrieval Of Dense-sparse Hybrid Vectors Using Graph-based Approximate Nearest Neighbor Search (2024) 0.00
- Hybrid Inverted Index Is A Robust Accelerator For Dense Retrieval (2022) 7.07
- Ultra-high Dimensional Sparse Representations With Binarization For Efficient Text Retrieval (2021) 8.60