BiXSE: Improving Dense Retrieval Via Probabilistic Graded Relevance Distillation
2025 · Christos Tsirigotis, Vaibhav Adlakha, Joao Monteiro, et al.
Abstract
Neural sentence embedding models for dense retrieval typically rely on binary relevance labels, treating query-document pairs as either relevant or irrelevant. However, real-world relevance often exists on a continuum, and recent advances in large language models (LLMs) have made it feasible to scale the generation of fine-grained graded relevance labels. In this work, we propose BiXSE, a simple and effective pointwise training method that optimizes binary cross-entropy (BCE) over LLM-generated graded relevance scores. BiXSE interprets these scores as probabilistic targets, enabling granular supervision from a single labeled query-document pair per query. Unlike pairwise or listwise losses that require multiple annotated comparisons per query, BiXSE achieves strong performance with reduced annotation and compute costs by leveraging in-batch negatives. Extensive experiments across sentence embedding (MMTEB) and retrieval benchmarks (BEIR, TREC-DL) show that BiXSE consistently outperform
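The pointwise objective described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes graded scores are already normalized to [0, 1], that the logit is a scaled dot product plus a bias, and that every other document in the batch is treated as a negative with target 0. The names `pointwise_graded_bce`, `scale`, and `bias` are hypothetical.

```python
import math

def bce_with_logits(logit, target):
    # Numerically stable binary cross-entropy on a raw logit:
    # max(x, 0) - x * t + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pointwise_graded_bce(query_vecs, doc_vecs, graded_scores, scale=1.0, bias=0.0):
    """Mean BCE over all query-document pairs in a batch.

    Assumptions (not taken from the paper): doc_vecs[i] is the single
    labeled document for query_vecs[i], graded_scores[i] is its
    LLM-generated relevance in [0, 1], and off-diagonal pairs act as
    in-batch negatives with target 0.
    """
    n = len(query_vecs)
    total = 0.0
    for i in range(n):
        for j in range(n):
            logit = scale * dot(query_vecs[i], doc_vecs[j]) + bias
            # Soft probabilistic target on the labeled pair, 0 elsewhere.
            target = graded_scores[i] if i == j else 0.0
            total += bce_with_logits(logit, target)
    return total / (n * n)

# Toy batch: two queries, each with one labeled document.
queries = [[1.0, 0.0], [0.0, 1.0]]
docs = [[1.0, 0.0], [0.0, 1.0]]
loss = pointwise_graded_bce(queries, docs, graded_scores=[0.9, 0.4], scale=10.0, bias=-5.0)
```

Because the target on the labeled pair is a probability rather than a hard 0 or 1, a partially relevant document (here, the one scored 0.4) pulls the similarity toward a middle value instead of being forced to either extreme.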
Related papers
- Enhancing The Ranking Context Of Dense Retrieval Methods Through Reciprocal Nearest Neighbors (2023) 4.52
- The Overlooked Role Of Graded Relevance Thresholds In Multilingual Dense Retrieval (2026) 0.00
- Pseudo-relevance Feedback For Multiple Representation Dense Retrieval (2021) 12.93
- LLM-augmented Retrieval: Enhancing Retrieval Models Through Language Models And Doc-level Embedding (2024) 0.00
- Learning Effective Representations For Retrieval Using Self-distillation With Adaptive Relevance Margins (2024) 2.26
- LexSemBridge: Fine-grained Dense Representation Enhancement Through Token-aware Embedding Augmentation (2025) 2.35
- Domain Adaptation For Dense Retrieval Through Self-supervision By Pseudo-relevance Labeling (2022) 0.00
- ExpandR: Teaching Dense Retrievers Beyond Queries With LLM Guidance (2025) 3.25