Fast And Multiphase Rates For Nearest Neighbor Classifiers
2023 Β· Pengkun Yang, Jingzhao Zhang
Abstract
We study the scaling of classification error rates with respect to the size of the training dataset. In contrast to classical results where rates are minimax optimal for a problem class, this work starts with the empirical observation that, even for a fixed data distribution, the error scaling can have *diverse* rates across different ranges of sample size. To understand when and why the error rate is non-uniform, we theoretically analyze nearest neighbor classifiers. We show that an error scaling law can have fine-grained rates: in the early phase, the test error depends polynomially on the data dimension and decreases fast; whereas in the later phase, the error depends exponentially on the data dimension and decreases slowly. Our analysis highlights the complexity of the data distribution in determining the test error. When the data are distributed benignly, we show that the generalization error of nearest neighbor classifier can depend polynomially, instead of exponentially, on the
Authors
(none)
Tags
Stats
Related papers
- On High-dimensional Modifications Of The Nearest Neighbor Classifier (2024)0.00
- Minimax Rate Optimal Adaptive Nearest Neighbor Classification And Regression (2019)8.35
- Fast And Bayes-consistent Nearest Neighbors (2019)0.00
- Fast Nearest-neighbor Classification Using RNN In Domains With Large Number Of Classes (2017)0.00
- Minimax Optimal Algorithms With Fixed-\(k\)-nearest Neighbors (2022)0.00
- An Adaptive Nearest Neighbor Rule For Classification (2019)0.00
- Nearest-neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions (2017)0.00
- Explaining The Success Of Nearest Neighbor Methods In Prediction (2025)18.63