Sparse Navigable Graphs For Nearest Neighbor Search: Algorithms And Hardness | Awesome Similarity Search Papers

Sparse Navigable Graphs For Nearest Neighbor Search: Algorithms And Hardness

Sanjeev Khanna, Ashwin Padaki, Erik Waingarten · Proceedings of the 2026 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) · 2025

We initiate the study of approximation algorithms and computational barriers for constructing sparse (\alpha)-navigable graphs [IX23, DGM+24], a core primitive underlying recent advances in graph-based nearest neighbor search. Given an (n)-point dataset (P) with an associated metric (\mathsf{d}) and a parameter (\alpha \geq 1), the goal is to efficiently build the sparsest graph (G=(P, E)) that is (\alpha)-navigable: for every distinct (s, t \in P), there exists an edge ((s, u) \in E) with (\mathsf{d}(u, t) < \mathsf{d}(s, t)/\alpha). We consider two natural sparsity objectives: minimizing the maximum out-degree and minimizing the total size. We first show a strong negative result: the slow-preprocessing version of DiskANN (analyzed in [IX23] for low-doubling metrics) can yield solutions whose sparsity is (\widetilde{Ω}(n)) times larger than optimal, even on Euclidean instances. We then show a tight approximation-preserving equivalence between the Sparsest Navigable Graph problem and the classic Set Cover problem, obtaining an (O(n^3))-time ((\ln n + 1))-approximation algorithm, as well as establishing NP-hardness of achieving an (o(\ln n))-approximation. Building on this equivalence, we develop faster (O(\ln n))-approximation algorithms. The first runs in (\widetilde{O}(n \cdot \mathrm{OPT})) time and is thus much faster when the optimal solution is sparse. The second, based on fast matrix multiplication, is a bicriteria algorithm that computes an (O(\ln n))-approximation to the sparsest (2\alpha)-navigable graph, running in (\widetilde{O}(n^{\omega})) time. Finally, we complement our upper bounds with a query complexity lower bound, showing that any (o(n))-approximation requires examining (Ω(n^2)) distances. This result shows that in the regime where (\mathrm{OPT} = \widetilde{O}(n)), our (\widetilde{O}(n \cdot \mathrm{OPT}))-time algorithm is essentially best possible.

Explore more on:
Uncategorized
Similar Work
Loading…