Learning To Retrieve: How To Train A Dense Retrieval Model Effectively And Efficiently
2020 · Jingtao Zhan, Jiaxin Mao, Yiqun Liu, et al.
Abstract
Ranking has always been one of the top concerns in information retrieval research. For decades, lexical matching signals have dominated the ad-hoc retrieval process, but they have inherent defects, such as the vocabulary mismatch problem. Recently, Dense Retrieval (DR) techniques have been proposed to alleviate these limitations by capturing the deep semantic relationship between queries and documents. The training of most existing Dense Retrieval models relies on sampling negative instances from the corpus to optimize a pairwise loss function. Through investigation, we find that this kind of training strategy is biased and fails to optimize full retrieval performance effectively and efficiently. To solve this problem, we propose a Learning To Retrieve (LTRe) training technique. LTRe constructs the document index beforehand. At each training iteration, it performs full retrieval without negative sampling and then updates the query representation model parameters. Through this process, it
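The training loop described in the abstract (a prebuilt document index, full retrieval at every iteration, updates to the query side only) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the linear query encoder `W`, the brute-force inner-product search, the pairwise logistic loss, and all function names are hypothetical choices made for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def encode_query(W, x):
    # Assumed toy query encoder: a linear map q = W x.
    return [dot(row, x) for row in W]

def full_retrieval(q, doc_index, k):
    # Brute-force top-k over the whole index by inner product (no sampling).
    scores = [dot(q, d) for d in doc_index]
    return sorted(range(len(doc_index)), key=lambda i: -scores[i])[:k]

def train_step(W, x, rel_id, doc_index, k, lr):
    """One iteration: retrieve over the full index, then update W in place.

    Document embeddings in doc_index are treated as fixed and never modified.
    """
    q = encode_query(W, x)
    retrieved = full_retrieval(q, doc_index, k)
    d_pos = doc_index[rel_id]
    loss = 0.0
    grad_q = [0.0] * len(q)
    for i in retrieved:
        if i == rel_id:
            continue  # every other retrieved document acts as a negative
        d_neg = doc_index[i]
        margin = dot(q, d_pos) - dot(q, d_neg)
        loss += softplus(-margin)      # assumed pairwise logistic loss
        coeff = -sigmoid(-margin)      # d loss / d margin
        for j in range(len(q)):
            grad_q[j] += coeff * (d_pos[j] - d_neg[j])
    # Chain rule through q = W x: d q[j] / d W[j][m] = x[m].
    for j in range(len(W)):
        for m in range(len(x)):
            W[j][m] -= lr * grad_q[j] * x[m]
    return loss

# Toy data: three fixed document embeddings, with document 0 as the relevant one.
docs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
x = [0.0, 1.0]

before = full_retrieval(encode_query(W, x), docs, k=1)
loss1 = train_step(W, x, rel_id=0, doc_index=docs, k=2, lr=1.0)
loss2 = train_step(W, x, rel_id=0, doc_index=docs, k=2, lr=1.0)
after = full_retrieval(encode_query(W, x), docs, k=1)
```

In this sketch the negatives are whatever the current query representation actually retrieves, so nothing is sampled from the corpus, and only `W` changes while `docs` stays untouched. A real system would presumably replace the brute-force scan with an approximate nearest-neighbor index and the linear map with a neural query encoder.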
Related papers
- Optimizing Dense Retrieval Model Training With Hard Negatives (2021) 16.34
- Efficiently Teaching An Effective Dense Retriever With Balanced Topic Aware Sampling (2021) 17.07
- Dense Text Retrieval Based On Pretrained Language Models: A Survey (2022) 15.95
- Disentangled Modeling Of Domain And Relevance For Adaptable Dense Retrieval (2022) 0.00
- Curriculum Learning For Dense Retrieval Distillation (2022) 11.49
- Approximate Nearest Neighbor Negative Contrastive Learning For Dense Text Retrieval (2020) 0.00
- Enhancing The Ranking Context Of Dense Retrieval Methods Through Reciprocal Nearest Neighbors (2023) 4.52
- How To Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval (2023) 11.39