BEIR

Emerging

25 papers using it

268 HF downloads

10 HF likes

2024 first seen

Dataset Card for BEIR Benchmark

Dataset Summary

BEIR is a heterogeneous benchmark that has been built from 18 diverse datasets representing 9 information retrieval tasks:

Fact-checking: FEVER, Climate-FEVER, SciFact
Question-Answering: NQ, HotpotQA, FiQA-2018
Bio-Medical IR: TREC-COVID, BioASQ, NFCorpus
News Retrieval:

🤗 Hugging Face · ⚖ cc-by-sa-4.0

Papers using BEIR (25)

ProRank: Prompt Warmup via Reinforcement Learning for Small Language Models Reranking · 2025

Enhancing Lexicon-Based Text Embeddings with Large Language Models · 2025 · 1 cites

A Systematic Study of Pseudo-Relevance Feedback with LLMs · 2026

RankSteer: Activation Steering for Pointwise LLM Ranking · 2026

LLM-Confidence Reranker: A Training-Free Approach for Enhancing Retrieval-Augmented Generation Systems · 2026

LLM2IR: simple unsupervised contrastive learning makes long-context LLM great retriever · 2026

Making Large Language Models Efficient Dense Retrievers · 2025

Contextual Relevance and Adaptive Sampling for LLM-Based Document Reranking · 2025

Scalable In-context Ranking with Generative Models · 2025

Doc2Query++: Topic-Coverage based Document Expansion and its Application to Dense Retrieval via Dual-Index Fusion · 2025

ERank: Fusing Supervised Fine-Tuning and Reinforcement Learning for Effective and Efficient Text Reranking · 2025

How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models · 2025

DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation · 2025

Retrieval Capabilities of Large Language Models Scale with Pretraining FLOPs · 2025

Precise Zero-Shot Pointwise Ranking with LLMs through Post-Aggregated Global Context Information · 2025

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval · 2025

AcuRank: Uncertainty-Aware Adaptive Computation for Listwise Reranking · 2025

Utility-Focused LLM Annotation for Retrieval and Retrieval-Augmented Generation · 2025

Pseudo Relevance Feedback is Enough to Close the Gap Between Small and Large Dense Retrieval Models · 2025

RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models · 2025

Scaling Sparse and Dense Retrieval in Decoder-Only LLMs · 2025

Matryoshka Re-Ranker: A Flexible Re-Ranking Architecture With Configurable Depth and Width · 2025

JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking · 2024 · 3 cites

Self-Calibrated Listwise Reranking with Large Language Models · 2024