MS-Shift: An Analysis Of MS MARCO Distribution Shifts On Neural Retrieval
2022 · Simon Lupart, Thibault Formal, Stéphane Clinchant
Abstract
Pre-trained Language Models have recently emerged in Information Retrieval as providing the backbone of a new generation of neural systems that outperform traditional methods on a variety of tasks. However, it is still unclear to what extent such approaches generalize in zero-shot conditions. The recent BEIR benchmark provides partial answers to this question by comparing models on datasets and tasks that differ from the training conditions. We aim to address the same question by comparing models under more explicit distribution shifts. To this end, we build three query-based distribution shifts within MS MARCO (query-semantic, query-intent, query-length), which are used to evaluate the three main families of neural retrievers based on BERT: sparse, dense, and late-interaction -- as well as a monoBERT re-ranker. We further analyse the performance drops between the train and test query distributions. In particular, we experiment with two generalization indicators: the first one based on
Related papers
- Transfer Learning Approaches For Building Cross-language Dense Retrieval Models (2022)
- The Tale Of Two MS MARCO -- And Their Unfair Comparisons (2023)
- Scaling Sparse And Dense Retrieval In Decoder-only LLMs (2025)
- A Comparative Study Of Specialized LLMs As Dense Retrievers (2025)
- How Train-test Leakage Affects Zero-shot Retrieval (2022)
- Boosting Zero-shot Cross-lingual Retrieval By Training On Artificially Code-switched Data (2023)
- Out-of-domain Semantics To The Rescue! Zero-shot Hybrid Retrieval Models (2022)
- How Different Are Pre-trained Transformers For Text Ranking? (2022)