DEANN: Speeding Up Kernel-density Estimation Using Approximate Nearest Neighbor Search
2021 · Matti Karppa, Martin Aumüller, Rasmus Pagh
Abstract
Kernel Density Estimation (KDE) is a nonparametric method for estimating the shape of a density function, given a set of samples from the distribution. Recently, locality-sensitive hashing, originally proposed as a tool for nearest neighbor search, has been shown to enable fast KDE data structures. However, these approaches do not take advantage of the many other advances that have been made in algorithms for nearest neighbor algorithms. We present an algorithm called Density Estimation from Approximate Nearest Neighbors (DEANN) where we apply Approximate Nearest Neighbor (ANN) algorithms as a black box subroutine to compute an unbiased KDE. The idea is to find points that have a large contribution to the KDE using ANN, compute their contribution exactly, and approximate the remainder with Random Sampling (RS). We present a theoretical argument that supports the idea that an ANN subroutine can speed up the evaluation. Furthermore, we provide a C++ implementation with a Python interface
Authors
(none)
Tags
Stats
Related papers
- Kernel Density Estimation Through Density Constrained Near Neighbor Search (2020)5.84
- Efficient Data-aware Distance Comparison Operations For High-dimensional Approximate Nearest Neighbor Search (2024)5.24
- EFANNA : An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based On Knn Graph (2016)0.00
- Approximate Nearest Neighbour Search On Dynamic Datasets: An Investigation (2024)0.00
- High-dimensional Approximate Nearest Neighbor Search: With Reliable And Efficient Distance Comparison Operations (2023)13.44
- PECANN: Parallel Efficient Clustering With Graph-based Approximate Nearest Neighbor Search (2023)0.00
- Fast Private Kernel Density Estimation Via Locality Sensitive Quantization (2023)0.00
- Effective And General Distance Computation For Approximate Nearest Neighbor Search (2024)5.84