MA-DPR: Manifold-aware Distance Metrics For Dense Passage Retrieval
2025 · Yifan Liu, Qianfeng Wen, Mark Zhao, et al.
Abstract
Dense Passage Retrieval (DPR) typically relies on Euclidean or cosine distance to measure query-passage relevance in embedding space, which is effective when embeddings lie on a linear manifold. However, our experiments across DPR benchmarks suggest that embeddings often lie on lower-dimensional, non-linear manifolds, especially in out-of-distribution (OOD) settings, where cosine and Euclidean distance fail to capture semantic similarity. To address this limitation, we propose a manifold-aware distance metric for DPR (MA-DPR) that models the intrinsic manifold structure of passages using a nearest neighbor graph and measures query-passage distance based on their shortest path in this graph. We show that MA-DPR outperforms Euclidean and cosine distances by up to 26% on OOD passage retrieval with comparable in-distribution performance across various embedding models while incurring a minimal increase in query inference time. Empirical evidence suggests that manifold-aware distance allows
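The graph-based distance described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the choice of Euclidean edge weights, the symmetric k-nearest-neighbor construction, and the way the query is attached to its k nearest passages are all assumptions made for illustration. It uses toy 2-D points arranged in a "C" shape, where straight-line distance cuts across the gap but the shortest path has to follow the curve.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph: node -> {neighbor: edge length}."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        ranked = sorted((euclidean(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in ranked[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def graph_distances(passages, query, k=2):
    """Shortest-path distance from the query to every passage (Dijkstra)."""
    graph = knn_graph(passages, k)
    # Assumption: the query joins the graph as an extra node linked to its
    # k nearest passages by Euclidean distance.
    q = len(passages)
    nearest = sorted((euclidean(query, p), j) for j, p in enumerate(passages))[:k]
    graph[q] = {j: d for d, j in nearest}

    dist = {q: 0.0}
    heap = [(0.0, q)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    # Passages unreachable from the query get an infinite distance.
    return [dist.get(j, math.inf) for j in range(len(passages))]


# Toy "passage embeddings" laid out along a C-shaped curve.
passages = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (2, 3), (1, 3), (0, 3)]
query = (0, 0.5)

dists = graph_distances(passages, query, k=2)
graph_ranking = sorted(range(len(passages)), key=lambda j: dists[j])
euclid_ranking = sorted(range(len(passages)), key=lambda j: euclidean(query, passages[j]))
print("graph ranking:    ", graph_ranking)
print("euclidean ranking:", euclid_ranking)
```

In this toy layout the last passage sits at the far end of the curve but directly across the gap from the query, so Euclidean distance ranks it ahead of passages that are closer along the curve, while the graph distance ranks it last. A practical system would likely replace the brute-force neighbor search with an approximate nearest-neighbor index.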