A Representation Sharpening Framework For Zero Shot Dense Retrieval
2025 · Dhananjay Ashok, Suraj Nair, Mutasem Al-Darabsah, et al.
Abstract
Zero-shot dense retrieval is a challenging setting where a document corpus is provided without relevant queries, necessitating a reliance on pretrained dense retrievers (DRs). However, since these DRs are not trained on the target corpus, they struggle to represent semantic differences between similar documents. To address this failing, we introduce a training-free representation sharpening framework that augments a document's representation with information that helps differentiate it from similar documents in the corpus. On over twenty datasets spanning multiple languages, the representation sharpening framework proves consistently superior to traditional retrieval, setting a new state-of-the-art on the BRIGHT benchmark. We show that representation sharpening is compatible with prior approaches to zero-shot dense retrieval and consistently improves their performance. Finally, we address the performance-cost tradeoff presented by our framework and devise an indexing-time approximation
Related papers
- Selecting Which Dense Retriever To Use For Zero-shot Search (2023) 6.34
- Injecting Domain Adaptation With Learning-to-hash For Effective And Efficient Zero-shot Dense Retrieval (2022) 2.80
- Precise Zero-shot Dense Retrieval Without Relevance Labels (2022) 17.27
- A Distributed Collaborative Retrieval Framework Excelling In All Queries And Corpora Based On Zero-shot Rank-oriented Automatic Evaluation (2024) 0.00
- Laprador: Unsupervised Pretrained Dense Retriever For Zero-shot Text Retrieval (2022) 8.82
- Learning Discrete Representations Via Constrained Clustering For Effective And Efficient Dense Retrieval (2021) 11.39
- Zero-shot Dense Retrieval With Momentum Adversarial Domain Invariant Representations (2021) 6.77
- COCO-DR: Combating Distribution Shifts In Zero-shot Dense Retrieval With Contrastive And Distributionally Robust Learning (2022) 12.85