Domain-aware RAG: Mol-enhanced RL For Efficient Training And Scalable Retrieval
2025 Β· Hao Lin, Peitong Xie, Jingxue Chen, et al.
Abstract
Retrieval-Augmented Generation (RAG) systems rely heavily on the retrieval stage, particularly the coarse-ranking process. Existing coarse-ranking optimization approaches often struggle to balance domain-specific knowledge learning with query enhencement, resulting in suboptimal retrieval performance. To address this challenge, we propose MoLER, a domain-aware RAG method that uses MoL-Enhanced Reinforcement Learning to optimize retrieval. MoLER has a two-stage pipeline: a continual pre-training (CPT) phase using a Mixture of Losses (MoL) to balance domain-specific knowledge with general language capabilities, and a reinforcement learning (RL) phase leveraging Group Relative Policy Optimization (GRPO) to optimize query and passage generation for maximizing document recall. A key innovation is our Multi-query Single-passage Late Fusion (MSLF) strategy, which reduces computational overhead during RL training while maintaining scalable inference via Multi-query Multi-passage Late Fusion (M
Authors
(none)
Tags
Stats
Related papers
- Optimizing Retrieval For RAG Via Reinforcement Learning (2025)0.00
- MLLM Is A Strong Reranker: Advancing Multimodal Retrieval-augmented Generation Via Knowledge-enhanced Reranking And Noise-injected Training (2024)9.18
- Funnelrag: A Coarse-to-fine Progressive Retrieval Paradigm For RAG (2024)3.58
- Multi-head RAG: Solving Multi-aspect Problems With Llms (2024)0.00
- Re-ranking The Context For Multimodal Retrieval Augmented Generation (2025)0.00
- Hetarag: Hybrid Deep Retrieval-augmented Generation Across Heterogeneous Data Stores (2025)3.27
- Universalrag: Retrieval-augmented Generation Over Corpora Of Diverse Modalities And Granularities (2025)0.00
- LMAR: Language Model Augmented Retriever For Domain-specific Knowledge Indexing (2025)1.57