An upper bound on prototype set size for condensed nearest neighbor
Abstract
The condensed nearest neighbor (CNN) algorithm is a heuristic for reducing the number of prototypical points stored by a nearest neighbor classifier, while keeping the classification rule given by the reduced prototypical set consistent with the full set. I present an upper bound on the number of prototypical points accumulated by CNN. The bound derives from a bound on the number of times the decision rule is updated during training of the multiclass perceptron algorithm, and is therefore independent of training set size.
- Publication: arXiv e-prints
- Pub Date: September 2013
- DOI: 10.48550/arXiv.1309.7676
- arXiv: arXiv:1309.7676
- Bibcode: 2013arXiv1309.7676C
- Keywords: Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: This was submitted to the journal Artificial Intelligence in 2009, and while it was considered technically sound, it was also believed to be of minor importance. My research has since moved on, so I'm unlikely to attempt a resubmission.