Approximate Nearest Neighbour Search On Dynamic Datasets: An Investigation
2024 · Ben Harwood, Amir Dezfouli, Iadine Chades, et al.
Abstract
Approximate k-Nearest Neighbour (ANN) methods are often used for mining information and aiding machine learning on large scale high-dimensional datasets. ANN methods typically differ in the index structure used for accelerating searches, resulting in various recall/runtime trade-off points. For applications with static datasets, runtime constraints and dataset properties can be used to empirically select an ANN method with suitable operating characteristics. However, for applications with dynamic datasets, which are subject to frequent online changes (like addition of new samples), there is currently no consensus as to which ANN methods are most suitable. Traditional evaluation approaches do not consider the computational costs of updating the index structure, as well as the rate and size of index updates. To address this, we empirically evaluate 5 popular ANN methods on two main applications (online data collection and online feature learning) while taking into account these considera
Authors
(none)
Tags
Stats
Related papers
- EFANNA : An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based On Knn Graph (2016)0.00
- Ann-benchmarks: A Benchmarking Tool For Approximate Nearest Neighbor Algorithms (2018)14.73
- A Comprehensive Survey And Experimental Comparison Of Graph-based Approximate Nearest Neighbor Search (2021)17.35
- A Memory-efficient Distributed Algorithm For Approximate Nearest Neighbour Search With Arbitrary Distances (2024)0.00
- Lorann: Low-rank Matrix Factorization For Approximate Nearest Neighbor Search (2024)2.26
- A Revisit Of Hashing Algorithms For Approximate Nearest Neighbor Search (2016)11.19
- Kannolo: Sweet And Smooth Approximate K-nearest Neighbors Search (2025)8.74
- Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph (2017)18.20