Finding Relevant Points For Nearest-neighbor Classification
2021 Β· David Eppstein
Abstract
In nearest-neighbor classification problems, a set of \(d\)-dimensional training points are given, each with a known classification, and are used to infer unknown classifications of other points by using the same classification as the nearest training point. A training point is relevant if its omission from the training set would change the outcome of some of these inferences. We provide a simple algorithm for thinning a training set down to its subset of relevant points, using as subroutines algorithms for finding the minimum spanning tree of a set of points and for finding the extreme points (convex hull vertices) of a set of points. The time bounds for our algorithm, in any constant dimension \(d\ge 3\), improve on a previous algorithm for the same problem by Clarkson (FOCS 1994).
Authors
(none)
Tags
Stats
Related papers
- Reducing Nearest Neighbor Training Sets Optimally And Exactly (2023)0.00
- On High-dimensional Modifications Of The Nearest Neighbor Classifier (2024)0.00
- Sublinear Data Structures For Nearest Neighbor In Ultra High Dimensions (2025)0.00
- Social Distancing Is Good For Points Too! (2020)0.00
- Fast And Bayes-consistent Nearest Neighbors (2019)0.00
- Explaining The Success Of Nearest Neighbor Methods In Prediction (2025)18.63
- A Multilabel Classification Framework For Approximate Nearest Neighbor Search (2019)0.00
- A Continuation Method For Discrete Optimization And Its Application To Nearest Neighbor Classification (2018)0.00