Selecting Which Dense Retriever To Use For Zero-shot Search
2023 · Ekaterina Khramtsova, Shengyao Zhuang, Mahsa Baktashmotlagh, et al.
Abstract
We propose the new problem of choosing which dense retrieval model to use when searching a new collection for which no labels are available, i.e. in a zero-shot setting. Many dense retrieval models are readily available, but their search effectiveness differs widely: not just on the test portion of the datasets on which the dense representations were learned but, importantly, also across datasets whose data was not used to learn those representations. This is because dense retrievers typically require training on a large amount of labeled data to achieve satisfactory search effectiveness in a specific dataset or domain. Moreover, effectiveness gains that dense retrievers obtain on datasets whose labels they observe during training do not necessarily generalise to datasets not observed during training. Selecting a model in this setting is, however, a hard problem: through empirical experimentation we show that methods inspired by
Related papers
- A Representation Sharpening Framework For Zero Shot Dense Retrieval (2025) · 0.00
- Boot And Switch: Alternating Distillation For Zero-shot Dense Retrieval (2023) · 0.00
- Embark On DenseQuest: A System For Selecting The Best Dense Retriever For A Custom Collection (2024) · 2.26
- Optimizing Dense Retrieval Model Training With Hard Negatives (2021) · 16.34
- Injecting Domain Adaptation With Learning-to-hash For Effective And Efficient Zero-shot Dense Retrieval (2022) · 2.80
- Learning To Retrieve: How To Train A Dense Retrieval Model Effectively And Efficiently (2020) · 0.00
- Precise Zero-shot Dense Retrieval Without Relevance Labels (2022) · 17.27
- LaPraDoR: Unsupervised Pretrained Dense Retriever For Zero-shot Text Retrieval (2022) · 8.82