UnifieR: A Unified Retriever For Large-scale Retrieval
2022 · Tao Shen, Xiubo Geng, Chongyang Tao, et al.
Abstract
Large-scale retrieval aims to recall relevant documents from a huge collection given a query. It relies on representation learning to embed documents and queries into a common semantic encoding space. Depending on the encoding space, recent retrieval methods based on pre-trained language models (PLMs) can be coarsely categorized into either dense-vector or lexicon-based paradigms. These two paradigms reveal the PLMs' representation capability at different granularities, i.e., global sequence-level compression and local word-level contexts, respectively. Inspired by their complementary global-local contextualization and distinct representing views, we propose a new learning framework, UnifieR, which unifies dense-vector and lexicon-based retrieval in one model with a dual-representing capability. Experiments on passage retrieval benchmarks verify its effectiveness in both paradigms. A uni-retrieval scheme is further presented with even better retrieval quality. We lastly evaluate the model
Related papers
- Universal Vision-language Dense Retrieval: Learning A Unified Representation Space For Multi-modal Retrieval (2022)
- Tevatron 2.0: Unified Document Retrieval Toolkit Across Scale, Language, And Modality (2025)
- Unifying Latent And Lexicon Representations For Effective Video-text Retrieval (2024)
- UniIR: Training And Benchmarking Universal Multimodal Information Retrievers (2023)
- LightRetriever: A LLM-based Text Retrieval Architecture With Extremely Faster Query Inference (2025)
- Uni-retriever: Towards Learning The Unified Embedding Based Retriever In Bing Sponsored Search (2022)
- Unicom: Universal And Compact Representation Learning For Image Retrieval (2023)
- Investigating Multi-layer Representations For Dense Passage Retrieval (2025)