A New Hashing Based Nearest Neighbors Selection Technique For Big Datasets
2020 · Jude Tchaye-Kondi, Yanlong Zhai, Liehuang Zhu
Abstract
KNN has a reputation as the world's simplest yet efficient supervised learning algorithm, used for either classification or regression. KNN's prediction efficiency depends heavily on the size of its training data, but as the training data grows KNN becomes slow at making decisions, since it must search for nearest neighbors within the entire dataset for every decision. This paper proposes a new technique that enables the selection of nearest neighbors directly in the neighborhood of a given observation. The proposed approach divides the data space into subcells of a virtual grid built on top of the data space. The mapping between data points and subcells is performed using hashing. To select the nearest neighbors of a given observation, we first identify the cell to which the observation belongs by hashing, and then look for nearest neighbors in that central cell and the cells around it, layer by layer. From our experiment performance analysis on
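The grid-and-hash lookup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `cell_size` parameter, the use of Euclidean distance, and the stopping rule are all assumptions made for the example.

```python
import math
from collections import defaultdict
from itertools import product


def cell_of(point, cell_size):
    """Hash a point to the integer coordinates of its grid cell."""
    return tuple(math.floor(x / cell_size) for x in point)


def build_grid(points, cell_size):
    """Map each occupied cell to the indices of the points it contains."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[cell_of(p, cell_size)].append(i)
    return grid


def ring(center, r):
    """Yield the cells at Chebyshev distance exactly r from the center cell."""
    for off in product(range(-r, r + 1), repeat=len(center)):
        if max(abs(o) for o in off) == r:
            yield tuple(c + o for c, o in zip(center, off))


def knn(query, points, grid, cell_size, k):
    """Return indices of the k nearest points, searching layer by layer."""
    center = cell_of(query, cell_size)
    found = []  # (distance, index) pairs collected so far
    r = 0
    while len(found) < len(points):
        for cell in ring(center, r):
            for i in grid.get(cell, ()):
                found.append((math.dist(query, points[i]), i))
        found.sort()
        # Assumed stopping rule: every unvisited point lies more than
        # r * cell_size away, so stop once the k-th candidate is that close.
        if len(found) >= k and found[k - 1][0] <= r * cell_size:
            break
        r += 1
    return [i for _, i in found[:k]]
```

Calling `build_grid(points, cell_size)` once and then `knn(query, points, grid, cell_size, k)` per query touches only the cells near the query instead of the whole dataset. Enumerating a full ring grows quickly with dimension, so this sketch is only practical for low-dimensional data.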
Related papers
- A Survey On Deep Hashing Methods (2020)
- Minimax Rate Optimal Adaptive Nearest Neighbor Classification And Regression (2019)
- Fast And Bayes-consistent Nearest Neighbors (2019)
- Efficient Data-aware Distance Comparison Operations For High-dimensional Approximate Nearest Neighbor Search (2024)
- A Scalable Solution To The Nearest Neighbor Search Problem Through Local-search Methods On Neighbor Graphs (2017)
- A Revisit Of Hashing Algorithms For Approximate Nearest Neighbor Search (2016)
- Approximate Knn Classification For Biomedical Data (2020)
- Approximate Nearest Neighbour Search On Dynamic Datasets: An Investigation (2024)