Multi-vector Retrieval As Sparse Alignment
2022 · Yujie Qian, Jinhyuk Lee, Sai Meher Karthik Duddu, et al.
Abstract
Multi-vector retrieval models improve over single-vector dual encoders on many information retrieval tasks. In this paper, we cast the multi-vector retrieval problem as sparse alignment between query and document tokens. We propose AligneR, a novel multi-vector retrieval model that learns sparsified pairwise alignments between query and document tokens (e.g. 'dog' vs. 'puppy') and per-token unary saliences reflecting their relative importance for retrieval. We show that controlling the sparsity of pairwise token alignments often brings significant performance gains. While most factoid questions focusing on a specific part of a document require a smaller number of alignments, others requiring a broader understanding of a document favor a larger number of alignments. Unary saliences, on the other hand, decide whether a token ever needs to be aligned with others for retrieval (e.g. 'kind' from 'kind of currency is used in new zealand'). With sparsified unary saliences, we are able to pr
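The scoring idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no formulas, so the dot-product similarity, the top-k rule used to sparsify alignments, the multiplicative use of saliences, and the name `sparse_alignment_score` are all assumptions made for illustration.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sparse_alignment_score(
    query_vecs: Sequence[Vector],
    doc_vecs: Sequence[Vector],
    query_salience: Sequence[float],
    doc_salience: Sequence[float],
    k: int = 1,
) -> float:
    """Hypothetical query-document score from sparse token alignments.

    Assumptions (not taken from the paper):
      - similarity between two tokens is a dot product;
      - each query token is aligned with its k most similar document tokens;
      - a salience of 0 removes a token from alignment entirely;
      - each kept alignment is weighted by both tokens' saliences.
    """
    total = 0.0
    for q_vec, q_sal in zip(query_vecs, query_salience):
        if q_sal == 0.0:
            continue  # this query token is never aligned
        # (similarity, document salience) for every document token still in play
        candidates = [
            (dot(q_vec, d_vec), d_sal)
            for d_vec, d_sal in zip(doc_vecs, doc_salience)
            if d_sal != 0.0
        ]
        # keep only the k strongest alignments for this query token
        top = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
        total += q_sal * sum(d_sal * sim for sim, d_sal in top)
    return total


# Toy 2-dimensional token embeddings (made-up values).
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

narrow = sparse_alignment_score(query, doc, [1.0, 0.0], [1.0, 1.0, 1.0], k=1)
broad = sparse_alignment_score(query, doc, [1.0, 0.0], [1.0, 1.0, 1.0], k=2)
```

In this toy setup the second query token has zero salience and contributes nothing, while raising `k` from 1 to 2 lets the first query token align with one more document token. That mirrors the abstract's contrast between questions that need few alignments and those that favor more.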
Related papers
- Rethinking The Role Of Token Retrieval In Multi-vector Retrieval (2023) 0.00
- Multimodal Representation Alignment For Cross-modal Information Retrieval (2025) 0.00
- Realign: Optimizing The Visual Document Retriever With Reasoning-guided Fine-grained Alignment (2026) 2.20
- Cross-modal Retrieval Augmentation For Multi-modal Classification (2021) 9.23
- Investigating Multi-layer Representations For Dense Passage Retrieval (2025) 0.00
- CITADEL: Conditional Token Interaction Via Dynamic Lexical Routing For Efficient And Effective Multi-vector Retrieval (2022) 13.05
- Generative Retrieval As Multi-vector Dense Retrieval (2024) 8.60
- Universal Vision-language Dense Retrieval: Learning A Unified Representation Space For Multi-modal Retrieval (2022) 3.45