On Approximate Nearest Neighbour Selection For Multi-stage Dense Retrieval
2021 · Craig MacDonald, Nicola Tonellotto
Abstract
Dense retrieval, which describes the use of contextualised language models such as BERT to identify documents from a collection by leveraging approximate nearest neighbour (ANN) techniques, has been increasing in popularity. Two families of approaches have emerged, depending on whether documents and queries are represented by single or multiple embeddings. ColBERT, the exemplar of the latter, uses an ANN index and approximate scores to identify a set of candidate documents for each query embedding, which are then re-ranked using accurate document representations. In this manner, a large number of documents can be retrieved for each query, hindering the efficiency of the approach. In this work, we investigate the use of ANN scores for ranking the candidate documents, in order to decrease the number of candidate documents being fully scored. Experiments conducted on the MSMARCO passage ranking corpus demonstrate that, by cutting off the candidate set by using the approximate scores to onl
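The two-stage idea described in the abstract can be illustrated with a minimal sketch: rank candidates by their approximate (ANN) scores, keep only the top few, and fully score just those. Everything named here is an assumption for illustration, not the authors' implementation: the function names, the `k_prime` cutoff, the toy data, and in particular the aggregation rule (summing, per document, the best approximate score for each query embedding). The ANN index itself is not modelled; `ann_hits` stands in for whatever `(doc_id, approx_score)` pairs such an index would return.

```python
from collections import defaultdict


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def approx_candidate_scores(ann_hits):
    """Aggregate approximate scores per document.

    ann_hits holds one list per query embedding, each containing
    (doc_id, approx_score) pairs. Assumed rule: for each query embedding
    take the best approximate score per document, then sum over embeddings.
    """
    totals = defaultdict(float)
    for hits in ann_hits:
        best = {}
        for doc_id, score in hits:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
        for doc_id, score in best.items():
            totals[doc_id] += score
    return dict(totals)


def prune_candidates(approx_scores, k_prime):
    """Keep the k_prime documents with the highest approximate score."""
    ranked = sorted(approx_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ranked[:k_prime]]


def maxsim(query_embs, doc_embs):
    """Exact late-interaction score: sum over query embeddings of the
    maximum inner product with any document embedding."""
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)


def rerank(query_embs, doc_store, candidates):
    """Fully score only the surviving candidates, best first."""
    scored = [(d, maxsim(query_embs, doc_store[d])) for d in candidates]
    return sorted(scored, key=lambda kv: (-kv[1], kv[0]))


# Toy data (hypothetical): two query embeddings, three documents.
query_embs = [[1.0, 0.0], [0.0, 1.0]]
doc_store = {
    "d1": [[1.0, 0.0], [0.0, 1.0]],
    "d2": [[1.0, 0.0]],
    "d3": [[0.5, 0.5]],
}
ann_hits = [
    [("d1", 0.9), ("d2", 0.8), ("d3", 0.4)],  # hits for query embedding 1
    [("d1", 0.7), ("d3", 0.5)],               # hits for query embedding 2
]

survivors = prune_candidates(approx_candidate_scores(ann_hits), k_prime=2)
final_ranking = rerank(query_embs, doc_store, survivors)
```

With this toy input, only two of the three documents reach the exact scoring stage. That is the efficiency lever the abstract describes: the smaller the cutoff, the fewer documents are fully scored, at the risk of discarding a relevant document whose approximate score was low.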
Related papers
- Pseudo-relevance Feedback For Multiple Representation Dense Retrieval (2021)
- Enhancing The Ranking Context Of Dense Retrieval Methods Through Reciprocal Nearest Neighbors (2023)
- Approximate Nearest Neighbor Negative Contrastive Learning For Dense Text Retrieval (2020)
- Investigating Multi-layer Representations For Dense Passage Retrieval (2025)
- Efficient And Effective Retrieval Of Dense-sparse Hybrid Vectors Using Graph-based Approximate Nearest Neighbor Search (2024)
- ColBERT: Efficient And Effective Passage Search Via Contextualized Late Interaction Over BERT (2020)
- Improving Query Representations For Dense Retrieval With Pseudo Relevance Feedback (2021)
- Noise-robust Dense Retrieval Via Contrastive Alignment Post Training (2023)