Quantile-based classifiers
Abstract
Quantile classifiers for potentially high-dimensional data assign an observation to the class that minimizes a sum of appropriately weighted component-wise distances between the observation's components and the within-class quantiles. The optimal quantile level can be chosen by minimizing the misclassification error on the training sample. It is shown that this choice is consistent, for $n \to \infty$, for the classification rule with the asymptotically optimal quantile, and that, under some assumptions, the probability of correct classification converges to one for $p \to \infty$. The role of skewness of the variables involved is discussed, which leads to an improved classifier. Compared with nine other classifiers, including the support vector machine and the recently proposed median-based classifier (Hall et al., 2009), which inspired the quantile classifier, the optimal quantile classifier performs very well in a comprehensive simulation study and on a real data set from chemistry (classification of bioaerosols).
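The rule described in the abstract can be illustrated with a minimal Python sketch. Several details here are assumptions rather than statements from the abstract: the "appropriately weighted" distance is taken to be the asymmetric check loss from quantile regression (weight $\theta$ when a component lies at or above the class quantile, $1-\theta$ when it lies below), quantiles are estimated by linear interpolation between order statistics, and the function names (`empirical_quantile`, `quantile_distance`, `fit_quantiles`, `predict`, `select_theta`) are hypothetical. The skewness correction mentioned in the abstract is not included.

```python
import math


def empirical_quantile(values, theta):
    """Empirical theta-quantile, interpolating linearly between order statistics."""
    s = sorted(values)
    pos = theta * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def quantile_distance(x, q, theta):
    """Sum of component-wise check-loss distances from x to the quantile vector q.

    Assumed weighting: theta if x_j >= q_j, otherwise 1 - theta.
    """
    total = 0.0
    for xj, qj in zip(x, q):
        weight = theta if xj >= qj else 1.0 - theta
        total += weight * abs(xj - qj)
    return total


def fit_quantiles(X, y, theta):
    """Compute the component-wise theta-quantiles within each class."""
    quantiles = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        quantiles[label] = [empirical_quantile(col, theta) for col in zip(*rows)]
    return quantiles


def predict(x, quantiles, theta):
    """Assign x to the class whose quantile vector is closest."""
    return min(quantiles, key=lambda label: quantile_distance(x, quantiles[label], theta))


def select_theta(X, y, grid):
    """Pick the quantile level in grid with the lowest training misclassification rate.

    Ties are resolved in favour of the earliest grid value.
    """
    best_theta, best_err = None, float("inf")
    for theta in grid:
        quantiles = fit_quantiles(X, y, theta)
        errors = sum(predict(x, quantiles, theta) != lab for x, lab in zip(X, y))
        err = errors / len(y)
        if err < best_err:
            best_theta, best_err = theta, err
    return best_theta, best_err


# Toy data: two well-separated classes in two dimensions.
X = [[0, 0], [1, 1], [2, 0], [10, 10], [11, 9], [12, 11]]
y = [0, 0, 0, 1, 1, 1]
grid = [0.1, 0.25, 0.5, 0.75, 0.9]
theta, err = select_theta(X, y, grid)
```

With $\theta = 0.5$ both weights are equal, so the rule reduces to classification by distance to the component-wise class medians, which corresponds to the median-based classifier the abstract cites as its inspiration. Selecting $\theta$ on the training error, as done here, mirrors the abstract's description; the grid of candidate levels is an illustrative choice.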
- Publication: arXiv e-prints
- Pub Date: March 2013
- DOI: 10.48550/arXiv.1303.1282
- arXiv: arXiv:1303.1282
- Bibcode: 2013arXiv1303.1282H
- Keywords: Statistics - Methodology