The Impacts Of Data, Ordering, And Intrinsic Dimensionality On Recall In Hierarchical Navigable Small Worlds
2024 · Owen Pendrigh Elliott, Jesse Clark
Abstract
Vector search systems, pivotal in AI applications, often rely on the Hierarchical Navigable Small Worlds (HNSW) algorithm. However, the behaviour of HNSW in real-world scenarios using vectors generated with deep learning models remains under-explored. Existing Approximate Nearest Neighbours (ANN) benchmarks and research typically over-rely on simplistic datasets like MNIST or SIFT1M and fail to reflect the complexity of current use-cases. Our investigation focuses on HNSW's efficacy across a spectrum of datasets, including synthetic vectors tailored to mimic specific intrinsic dimensionalities, widely-used retrieval benchmarks with popular embedding models, and proprietary e-commerce image data with CLIP models. We survey the most popular HNSW vector databases and collate their default parameters to provide a realistic fixed parameterisation for the duration of the paper. We discover that the recall of approximate HNSW search, in comparison to exact K Nearest Neighbours
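The recall measure that the abstract refers to, approximate search results compared against exact K Nearest Neighbours, can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names (`exact_knn`, `recall_at_k`, `mean_recall`) are hypothetical, Euclidean distance is an assumption, and the "approximate" search here is only a stand-in that searches a random subset of the corpus rather than a real HNSW index.

```python
import math
import random


def exact_knn(query, vectors, k):
    """Brute-force exact search: indices of the k vectors closest to query."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result also contains."""
    exact = set(exact_ids)
    return len(exact.intersection(approx_ids)) / len(exact)


def mean_recall(approx_results, exact_results):
    """Average recall over a batch of queries."""
    pairs = list(zip(approx_results, exact_results))
    return sum(recall_at_k(a, e) for a, e in pairs) / len(pairs)


# Toy data: 200 random 8-dimensional vectors and 10 random queries (assumed sizes).
random.seed(0)
corpus = [[random.gauss(0, 1) for _ in range(8)] for _ in range(200)]
queries = [[random.gauss(0, 1) for _ in range(8)] for _ in range(10)]
k = 5

# Stand-in for an approximate index (NOT HNSW): exact search over a random
# 70% subset of the corpus, so some true neighbours are unreachable.
subset = random.sample(range(len(corpus)), 140)


def approx_knn(query):
    local = exact_knn(query, [corpus[i] for i in subset], k)
    return [subset[i] for i in local]  # map back to corpus indices


exact_results = [exact_knn(q, corpus, k) for q in queries]
approx_results = [approx_knn(q) for q in queries]
print(f"mean recall@{k}: {mean_recall(approx_results, exact_results):.3f}")
```

In a real evaluation, `approx_knn` would be replaced by queries against an HNSW index built with the parameters under study, while the exact results would still come from a brute-force scan.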
Related papers
- Down With The Hierarchy: The 'H' In HNSW Stands For "hubs" (2024)
- Efficient And Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs (2016)
- Dimensionality-reduction Techniques For Approximate Nearest Neighbor Search: A Survey And Evaluation (2024)
- Breaking The Curse Of Dimensionality: On The Stability Of Modern Vector Retrieval (2025)
- Enhancing HNSW Index For Real-time Updates: Addressing Unreachable Points And Performance Degradation (2024)
- From HNSW To Information-theoretic Binarization: Rethinking The Architecture Of Scalable Vector Search (2025)
- Practice With Graph-based ANN Algorithms On Sparse Data: Chi-square Two-tower Model, HNSW, Sign Cauchy Projections (2023)
- Exploring The Meaningfulness Of Nearest Neighbor Search In High-dimensional Space (2024)