Reducing Nearest Neighbor Training Sets Optimally And Exactly

Abstract

In nearest-neighbor classification, a training set \(P\) of points in \(\mathbb{R}^d\) with given classifications is used to classify every point in \(\mathbb{R}^d\): every point gets the same classification as its nearest neighbor in \(P\). Recently, Eppstein [SOSA'22] developed an algorithm to detect the relevant training points, that is, those points \(p\in P\) such that \(P\) and \(P\setminus\{p\}\) induce different classifications. We investigate the problem of finding a minimum cardinality reduced training set \(P'\subseteq P\) such that \(P\) and \(P'\) induce the same classification. We show that the set of relevant points is such a minimum cardinality reduced training set if \(P\) is in general position. Furthermore, we show that finding a minimum cardinality reduced training set for possibly degenerate \(P\) is solvable in polynomial time for \(d=1\), and NP-complete for \(d\geq 2\).
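To make the notions of induced classification and relevant point concrete, the following is a minimal brute-force sketch for the one-dimensional case. It is not Eppstein's algorithm and not the method of this paper. All names (`nn_label`, `decision_boundaries`, `induces_same`, `relevant_points`) are hypothetical, and the sketch assumes distinct coordinates and ignores how ties exactly on a boundary are classified.

```python
from typing import List, Tuple

Point = Tuple[float, str]  # (coordinate, label); hypothetical representation


def nn_label(training: List[Point], q: float) -> str:
    """Label of the training point nearest to q (ties broken by list order)."""
    return min(training, key=lambda p: abs(p[0] - q))[1]


def decision_boundaries(training: List[Point]) -> List[Tuple[float, str, str]]:
    """Midpoints between consecutive points with different labels.

    Assumes distinct coordinates. In 1D these midpoints, together with the
    labels on either side, describe the induced classification.
    """
    pts = sorted(training)
    return [((a[0] + b[0]) / 2, a[1], b[1])
            for a, b in zip(pts, pts[1:]) if a[1] != b[1]]


def induces_same(training: List[Point], reduced: List[Point]) -> bool:
    """True if both sets induce the same 1D classification (boundary ties ignored)."""
    if not training or not reduced:
        return False  # an empty set induces no classification
    full, red = decision_boundaries(training), decision_boundaries(reduced)
    if full != red:
        return False
    if not full:
        # No boundaries: the whole line gets one label, which must agree.
        return training[0][1] == reduced[0][1]
    return True


def relevant_points(training: List[Point]) -> List[Point]:
    """Points p such that P and P without p induce different classifications."""
    return [p for i, p in enumerate(training)
            if not induces_same(training, training[:i] + training[i + 1:])]


# Example: three 'a' points on the left, two 'b' points on the right.
P = [(0.0, "a"), (1.0, "a"), (2.0, "a"), (5.0, "b"), (6.0, "b")]
R = relevant_points(P)
```

In this example only the two points flanking the single decision boundary are relevant, and keeping just those two points induces the same classification as \(P\), in line with the general-position result stated above. The check removes one point at a time and recomputes all boundaries, so it is meant for illustration rather than efficiency.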
