ExpandR: Teaching Dense Retrievers Beyond Queries With LLM Guidance
2025 · Sijia Yao, Pengcheng Huang, Zhenghao Liu, et al.
Abstract
Large language models (LLMs) have demonstrated significant potential in enhancing dense retrieval through query augmentation. However, most existing methods treat the LLM and the retriever as separate modules, overlooking the alignment between generation and ranking objectives. In this work, we propose ExpandR, a unified LLM-augmented dense retrieval framework that jointly optimizes both the LLM and the retriever. ExpandR employs the LLM to generate semantically rich query expansions, which are leveraged to enhance the retriever's training. Simultaneously, the LLM is trained using Direct Preference Optimization (DPO), guided by a carefully designed reward function that balances retrieval effectiveness and generation consistency. This joint optimization paradigm enables mutual adaptation between the LLM and the retriever, resulting in query expansions that are both informative and well-suited for retrieval. Experimental results on multiple benchmarks show that ExpandR consistently outpe
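The abstract describes training the LLM with DPO under a reward that balances retrieval effectiveness and generation consistency. A minimal sketch of that idea follows. The reward fields, the weighting `alpha`, and the best-versus-worst pairing rule are illustrative assumptions rather than the paper's actual design; only the loss is the standard DPO objective.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """One LLM-generated query expansion with two hypothetical reward signals."""
    text: str
    retrieval_reward: float    # assumption: e.g. how highly the gold document ranks
    consistency_reward: float  # assumption: e.g. agreement with a reference answer


def combined_reward(c: Candidate, alpha: float = 0.5) -> float:
    # Assumed form: a simple convex combination of the two signals.
    return alpha * c.retrieval_reward + (1.0 - alpha) * c.consistency_reward


def preference_pair(candidates: list[Candidate], alpha: float = 0.5):
    # Assumed pairing rule: highest-reward expansion is "chosen",
    # lowest-reward expansion is "rejected".
    ranked = sorted(candidates, key=lambda c: combined_reward(c, alpha))
    return ranked[-1], ranked[0]


def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    # Standard DPO objective for one pair: -log(sigmoid(margin)), where the
    # margin compares policy-vs-reference log-ratios of chosen and rejected.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Numerically stable softplus(-margin).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    cands = [
        Candidate("expansion a", 1.0, 0.2),
        Candidate("expansion b", 0.1, 0.1),
        Candidate("expansion c", 0.5, 0.9),
    ]
    chosen, rejected = preference_pair(cands)
    print(chosen.text, "preferred over", rejected.text)
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

In a real pipeline the log-probabilities would come from the policy LLM and a frozen reference copy, and the retriever would be trained on the expanded queries alongside this step.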
Related papers
- ScalingNote: Scaling Up Retrievers With Large Language Models For Real-world Dense Retrieval (2024)
- LLM-augmented Retrieval: Enhancing Retrieval Models Through Language Models And Doc-level Embedding (2024)
- Making Large Language Models Efficient Dense Retrievers (2025)
- Revela: Dense Retriever Learning Via Language Modeling (2025)
- Evaluating The Effectiveness And Scalability Of LLM-based Data Augmentation For Retrieval (2025)
- LamRA: Large Multimodal Model As Your Advanced Retrieval Assistant (2024)
- LMAR: Language Model Augmented Retriever For Domain-specific Knowledge Indexing (2025)
- Pseudo Relevance Feedback Is Enough To Close The Gap Between Small And Large Dense Retrieval Models (2025)