Scaling Laws For Dense Retrieval
2024 · Yan Fang, Jingtao Zhan, Qingyao Ai, et al.
Abstract
Scaling up neural models has yielded significant advancements in a wide array of tasks, particularly in language generation. Previous studies have found that the performance of neural models frequently adheres to predictable scaling laws, correlated with factors such as training set size and model size. This insight is invaluable, especially as large-scale experiments grow increasingly resource-intensive. Yet such scaling laws have not been fully explored in dense retrieval, due to the discrete nature of retrieval metrics and the complex relationships between training data and model sizes in retrieval tasks. In this study, we investigate whether the performance of dense retrieval models follows scaling laws as other neural models do. We propose to use contrastive log-likelihood as the evaluation metric and conduct extensive experiments with dense retrieval models implemented with different numbers of parameters and trained with different amounts of annotated data. Results indicate that, unde
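The abstract proposes contrastive log-likelihood as a continuous alternative to discrete ranking metrics, but this excerpt does not give the formula. The sketch below assumes one common formulation: the log-probability of the annotated positive passage under a softmax over its score and the scores of sampled negatives. The function names and the example scores are hypothetical, chosen only for illustration, and may differ from the paper's exact definition.

```python
import math


def contrastive_log_likelihood(pos_score, neg_scores):
    """Log-probability of the positive passage under a softmax over
    the positive score and the negative scores (assumed formulation)."""
    scores = [pos_score] + list(neg_scores)
    max_score = max(scores)  # subtract the max for numerical stability
    log_partition = max_score + math.log(
        sum(math.exp(s - max_score) for s in scores)
    )
    return pos_score - log_partition


def mean_contrastive_log_likelihood(examples):
    """Average the per-query value over (pos_score, neg_scores) pairs."""
    values = [contrastive_log_likelihood(p, n) for p, n in examples]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical query-passage similarity scores for two queries.
    examples = [(2.0, [0.5, -1.0]), (0.3, [0.3])]
    print(mean_contrastive_log_likelihood(examples))
```

Unlike a top-k ranking metric, this quantity changes smoothly with the scores, which is the property that makes it a candidate for fitting scaling curves. It is always at most 0 and moves toward 0 as the positive passage outscores the negatives.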
Related papers
- ScalingNote: Scaling Up Retrievers With Large Language Models For Real-world Dense Retrieval (2024)
- Scaling Sparse And Dense Retrieval In Decoder-only LLMs (2025)
- Scaling Laws For Embedding Dimension In Information Retrieval (2026)
- Evaluating The Effectiveness And Scalability Of LLM-based Data Augmentation For Retrieval (2025)
- Unsupervised Dense Information Retrieval With Contrastive Learning (2021)
- Pseudo Relevance Feedback Is Enough To Close The Gap Between Small And Large Dense Retrieval Models (2025)
- CSPLADE: Learned Sparse Retrieval With Causal Language Models (2025)
- Unsupervised Dense Retrieval With Counterfactual Contrastive Learning (2024)