Resources For Brewing BEIR: Reproducible Reference Models And An Official Leaderboard
2023 · Ehsan Kamalloo, Nandan Thakur, Carlos Lassance, et al.
Abstract
BEIR is a benchmark dataset for zero-shot evaluation of information retrieval models across 18 different domain/task combinations. In recent years, we have witnessed the growing popularity of a representation learning approach to building retrieval models, typically using pretrained transformers in a supervised setting. This naturally raises the question: How effective are these models when presented with queries and documents that differ from the training data? Examples include searching in different domains (e.g., medical or legal text) and with different types of queries (e.g., keywords vs. well-formed questions). While BEIR was designed to answer these questions, our work addresses two shortcomings that prevent the benchmark from achieving its full potential: First, the sophistication of modern neural methods and the complexity of current software infrastructure create barriers to entry for newcomers. To this end, we provide reproducible reference implementations that cover the two m
Related papers
- BEIR: A Heterogenous Benchmark For Zero-shot Evaluation Of Information Retrieval Models (2021) 6.67
- Hindi-BEIR: A Large Scale Retrieval Benchmark In Hindi (2024) 0.00
- Systematic Evaluation Of Neural Retrieval Models On The Touché 2020 Argument Retrieval Subset Of BEIR (2024) 9.31
- MAIR: A Massive Benchmark For Evaluating Instructed Retrieval (2024) 6.41
- UniIR: Training And Benchmarking Universal Multimodal Information Retrievers (2023) 10.48
- Benchmarking And Building Zero-shot Hindi Retrieval Model With Hindi-BEIR And NLLB-E5 (2024) 0.00
- Hard Negatives, Hard Lessons: Revisiting Training Data Quality For Robust Information Retrieval With LLMs (2025) 2.26
- Evaluating Embedding APIs For Information Retrieval (2023) 8.09