Worst-case Performance Of Popular Approximate Nearest Neighbor Search Implementations: Guarantees And Limitations
2023 Β· Piotr Indyk, Haike Xu
Abstract
Graph-based approaches to nearest neighbor search are popular and powerful tools for handling large datasets in practice, but they have limited theoretical guarantees. We study the worst-case performance of recent graph-based approximate nearest neighbor search algorithms, such as HNSW, NSG and DiskANN. For DiskANN, we show that its "slow preprocessing" version provably supports approximate nearest neighbor search query with constant approximation ratio and poly-logarithmic query time, on data sets with bounded "intrinsic" dimension. For the other data structure variants studied, including DiskANN with "fast preprocessing", HNSW and NSG, we present a family of instances on which the empirical query time required to achieve a "reasonable" accuracy is linear in instance size. For example, for DiskANN, we show that the query procedure can take at least \(0.1 n\) steps on instances of size \(n\) before it encounters any of the \(5\) nearest neighbors of the query.
Authors
(none)
Tags
Stats
Related papers
- A Theoretical Analysis Of Nearest Neighbor Search On Approximate Near Neighbor Graph (2023)0.00
- A Comprehensive Survey And Experimental Comparison Of Graph-based Approximate Nearest Neighbor Search (2021)17.35
- Graph-based Nearest Neighbor Search: From Practice To Theory (2019)0.00
- Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph (2017)18.20
- Fast-convergent Proximity Graphs For Approximate Nearest Neighbor Search (2025)0.00
- A Scalable Solution To The Nearest Neighbor Search Problem Through Local-search Methods On Neighbor Graphs (2017)3.58
- Sparse Neighborhood Graph-based Approximate Nearest Neighbor Search Revisited: Theoretical Analysis And Optimization (2025)0.00
- Understanding And Generalizing Monotonic Proximity Graphs For Approximate Nearest Neighbor Search (2021)0.00