BEIR
Emerging · 25 papers using it
268 HF downloads
10 HF likes
First seen: 2024
Dataset Card for BEIR Benchmark
Dataset Summary
BEIR is a heterogeneous benchmark built from 18 diverse datasets representing 9 information retrieval tasks:
- Fact-checking: FEVER, Climate-FEVER, SciFact
- Question-Answering: NQ, HotpotQA, FiQA-2018
- Bio-Medical IR: TREC-COVID, BioASQ, NFCorpus
- News Retrieval:
Hugging Face · License: cc-by-sa-4.0
Papers using BEIR (25)
- ProRank: Prompt Warmup via Reinforcement Learning for Small Language Models Reranking
- Enhancing Lexicon-Based Text Embeddings with Large Language Models
- A Systematic Study of Pseudo-Relevance Feedback with LLMs
- RankSteer: Activation Steering for Pointwise LLM Ranking
- LLM-Confidence Reranker: A Training-Free Approach for Enhancing Retrieval-Augmented Generation Systems
- LLM2IR: simple unsupervised contrastive learning makes long-context LLM great retriever
- Making Large Language Models Efficient Dense Retrievers
- Contextual Relevance and Adaptive Sampling for LLM-Based Document Reranking
- Scalable In-context Ranking with Generative Models
- Doc2Query++: Topic-Coverage based Document Expansion and its Application to Dense Retrieval via Dual-Index Fusion
- ERank: Fusing Supervised Fine-Tuning and Reinforcement Learning for Effective and Efficient Text Reranking
- How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models
- DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation
- Retrieval Capabilities of Large Language Models Scale with Pretraining FLOPs
- Precise Zero-Shot Pointwise Ranking with LLMs through Post-Aggregated Global Context Information
- From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval
- AcuRank: Uncertainty-Aware Adaptive Computation for Listwise Reranking
- From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval
- Utility-Focused LLM Annotation for Retrieval and Retrieval-Augmented Generation
- Pseudo Relevance Feedback is Enough to Close the Gap Between Small and Large Dense Retrieval Models
- RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models
- Scaling Sparse and Dense Retrieval in Decoder-Only LLMs
- Matryoshka Re-Ranker: A Flexible Re-Ranking Architecture With Configurable Depth and Width
- JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking
- Self-Calibrated Listwise Reranking with Large Language Models