UniFAR: A Unified Facet-Aware Retrieval Framework for Scientific Documents
2026 · Zheng Dou, Zhao Zhang, Deqing Wang, et al.
Abstract
Existing scientific document retrieval (SDR) methods primarily rely on document-centric representations learned from inter-document relationships for document-document (doc-doc) retrieval. However, the rise of LLMs and RAG has shifted SDR toward question-driven retrieval, where documents are retrieved in response to natural-language questions (q-doc). This change has led to systematic mismatches between document-centric models and question-driven retrieval, including (1) input granularity (long documents vs. short questions), (2) semantic focus (scientific discourse structure vs. specific question intent), and (3) training signals (citation-based similarity vs. question-oriented relevance). To this end, we propose UniFAR, a Unified Facet-Aware Retrieval framework to jointly support doc-doc and q-doc SDR within a single architecture. UniFAR reconciles granularity differences through adaptive multi-granularity aggregation, aligns document structure with question intent via learnable face
Related papers
- Corank: LLM-based Compact Reranking With Document Features For Scientific Retrieval (2025) 0.00
- Multi-facet Blending For Faceted Query-by-example Retrieval (2024) 0.00
- Pairsem: LLM-guided Pairwise Semantic Matching For Scientific Document Retrieval (2025) 0.00
- Unifier: A Unified Retriever For Large-scale Retrieval (2022) 7.50
- Adapting Learned Sparse Retrieval For Long Documents (2023) 5.24
- Simpledoc: Multi-modal Document Understanding With Dual-cue Page Retrieval And Iterative Refinement (2025) 5.50
- Resolving The Robustness-precision Trade-off In Financial RAG Through Hybrid Document-routed Retrieval (2026) 0.00
- Chain Of Retrieval: Multi-aspect Iterative Search Expansion And Post-order Search Aggregation For Full Paper Retrieval (2025) 0.95