Multi-view Document Representation Learning For Open-domain Dense Retrieval
2022 · Shunyu Zhang, Yaobo Liang, Ming Gong, et al.
Abstract
Dense retrieval has achieved impressive advances in first-stage retrieval from large-scale document collections. It is built on a bi-encoder architecture that produces a single vector representation for each query and each document. However, a document can usually answer multiple potential queries from different views, so a single vector representation is hard to match with multi-view queries and suffers from a semantic mismatch problem. This paper proposes a multi-view document representation learning framework that produces multi-view embeddings to represent documents and enforces their alignment with different queries. First, we propose a simple yet effective method of generating multiple embeddings through viewers. Second, to prevent the multi-view embeddings from collapsing into the same one, we further propose a global-local loss with annealed temperature that encourages the multiple viewers to better align with different potential queries. Experiments show our method outperforms recent wor
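The matching idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the scoring function, the loss, or the temperature schedule, so everything below is an assumption made for illustration: a document is scored by its best-matching view (max over dot products), the local term is a softmax over one document's views, and the temperature decays exponentially. The function names (`multi_view_score`, `annealed_temperature`, `local_loss`) and the default values are hypothetical, not the authors' implementation.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def multi_view_score(query, doc_views):
    """Score a document by its best-matching view.

    Assumption: views are aggregated with max; the abstract does not say so.
    """
    return max(dot(query, view) for view in doc_views)


def annealed_temperature(step, total_steps, t_start=1.0, t_end=0.1):
    """Hypothetical schedule: decay exponentially from t_start to t_end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return t_start * (t_end / t_start) ** frac


def local_loss(query, doc_views, temperature):
    """Negative log-probability of the best view under a softmax over views.

    Assumption: this stands in for the "local" part of a global-local loss.
    A lower temperature sharpens the softmax, so the best view is separated
    from the other views of the same document more strongly.
    """
    scores = [dot(query, view) / temperature for view in doc_views]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    return -math.log(max(exps) / sum(exps))


# Toy example: one query and a document with three view embeddings.
query = [1.0, 0.0]
doc_views = [[0.9, 0.1], [0.0, 1.0], [-0.5, 0.5]]

score = multi_view_score(query, doc_views)
loss_early = local_loss(query, doc_views, annealed_temperature(0, 100))
loss_late = local_loss(query, doc_views, annealed_temperature(100, 100))
```

In this toy setup the document is retrieved through its first view, and the local term shrinks as the temperature is annealed, because the softmax concentrates on the best-matching view. A real system would add a global contrastive term across documents and learn the view embeddings with an encoder, neither of which is modeled here.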
Related papers
- Learning Diverse Document Representations With Deep Query Interactions For Dense Retrieval (2022) 2.51
- Investigating Multi-layer Representations For Dense Passage Retrieval (2025) 0.00
- Improving Document Representations By Generating Pseudo Query Embeddings For Dense Retrieval (2021) 9.41
- Universal Vision-language Dense Retrieval: Learning A Unified Representation Space For Multi-modal Retrieval (2022) 3.45
- Learning From Multiview Correlations In Open-domain Videos (2018) 5.84
- MURE: Hierarchical Multi-resolution Encoding Via Vision-language Models For Visual Document Retrieval (2026) 0.00
- A Multi-resolution Word Embedding For Document Retrieval From Large Unstructured Knowledge Bases (2019) 0.00
- Dual Encoding For Video Retrieval By Text (2020) 16.05