Enhancing The Ranking Context Of Dense Retrieval Methods Through Reciprocal Nearest Neighbors
2023 · George Zerveas, Navid Rekabsaz, Carsten Eickhoff
Abstract
Sparse annotation poses persistent challenges to training dense retrieval models; for example, it distorts the training signal when unlabeled relevant documents are used spuriously as negatives in contrastive learning. To alleviate this problem, we introduce evidence-based label smoothing, a novel, computationally efficient method that prevents penalizing the model for assigning high relevance to false negatives. To compute the target relevance distribution over candidate documents within the ranking context of a given query, we assign a non-zero relevance probability to those candidates most similar to the ground truth based on the degree of their similarity to the ground-truth document(s). To estimate relevance we leverage an improved similarity metric based on reciprocal nearest neighbors, which can also be used independently to rerank candidates in post-processing. Through extensive experiments on two large-scale ad hoc text retrieval datasets, we demonstrate that reciprocal near
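The idea the abstract describes can be illustrated with a minimal sketch: find the candidates that are reciprocal nearest neighbors of the ground-truth document, then spread some target probability onto them in proportion to their similarity. This is not the authors' implementation. The function names, the plain top-k reciprocity test, the `k` and `gt_weight` parameters, and the toy similarity matrix are all assumptions made for illustration, and the paper's actual similarity metric may differ.

```python
import numpy as np


def top_k_mask(sim: np.ndarray, k: int) -> np.ndarray:
    """Return M where M[i, j] is True if j is one of the k items most similar to i."""
    n = sim.shape[0]
    s = sim.astype(float).copy()
    np.fill_diagonal(s, -np.inf)  # an item is never counted as its own neighbor
    idx = np.argsort(-s, axis=1)[:, :k]  # indices of the k largest similarities per row
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], idx] = True
    return mask


def reciprocal_neighbors(sim: np.ndarray, k: int) -> np.ndarray:
    """R[i, j] is True only if i and j appear in each other's top-k lists."""
    m = top_k_mask(sim, k)
    return m & m.T


def smoothed_targets(sim: np.ndarray, gt_index: int, k: int,
                     gt_weight: float = 1.0) -> np.ndarray:
    """Hypothetical target distribution over candidates for one query.

    Candidates that are reciprocal neighbors of the ground-truth document
    receive mass proportional to their similarity to it; all others get zero.
    """
    recip = reciprocal_neighbors(sim, k)[gt_index]
    scores = np.where(recip, np.clip(sim[gt_index], 0.0, None), 0.0)
    scores[gt_index] = gt_weight  # the labeled document keeps the largest share
    return scores / scores.sum()


# Toy candidate-to-candidate similarities (assumed values); index 0 is the labeled positive.
sim = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
])
targets = smoothed_targets(sim, gt_index=0, k=2)
print(targets)
```

In this toy example, candidate 2 is within the top 2 neighbors of the labeled document, but the labeled document is not within candidate 2's top 2, so candidate 2 receives no target mass. Candidate 1 is a neighbor in both directions, so it does. Training against such a distribution, for instance with a KL-divergence loss, would avoid penalizing the model for scoring candidate 1 highly.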
Related papers
- Learning To Retrieve: How To Train A Dense Retrieval Model Effectively And Efficiently (2020) 0.00
- Bixse: Improving Dense Retrieval Via Probabilistic Graded Relevance Distillation (2025) 0.00
- CODER: An Efficient Framework For Improving Retrieval Through Contextual Document Embedding Reranking (2021) 7.16
- Optimizing Dense Retrieval Model Training With Hard Negatives (2021) 16.34
- Approximate Nearest Neighbor Negative Contrastive Learning For Dense Text Retrieval (2020) 0.00
- Pseudo-relevance Feedback For Multiple Representation Dense Retrieval (2021) 12.93
- Pairdistill: Pairwise Relevance Distillation For Dense Retrieval (2024) 7.24
- Learning More From Less: Towards Strengthening Weak Supervision For Ad-hoc Retrieval (2019) 5.84