Assessing prediction error of nonparametric regression and classification under Bregman divergence
Abstract
Prediction error is critical for assessing the performance of statistical methods and for selecting statistical models. We propose cross-validation and approximated cross-validation methods for estimating prediction error under a broad q-class of error measures based on Bregman divergence, which embeds nearly all of the loss functions commonly used in regression, classification, and the machine-learning literature. The approximated cross-validation formulas are derived analytically, enabling fast estimation of prediction error under Bregman divergence. We then study a data-driven optimal bandwidth selector for local-likelihood estimation that minimizes the overall prediction error or, equivalently, the covariance penalty. We show that the covariance-penalty and cross-validation methods converge to the same mean prediction error criterion. We also propose a lower-bound scheme for computing local logistic regression estimates and demonstrate that it is as simple and stable as local least-squares regression estimation; the algorithm monotonically increases the target local likelihood and converges. The ideas and methods extend to generalized varying-coefficient models and semiparametric models.
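As an illustration (not taken from the paper), below is a minimal Python sketch of the two ingredients the abstract combines: a Bregman divergence Q(y, mu) = q(mu) + (y - mu)q'(mu) - q(y) generated by a concave function q, which recovers squared-error loss for q(mu) = -mu^2 and the Bernoulli deviance for q(mu) = -2[mu log mu + (1 - mu) log(1 - mu)], together with an exact leave-one-out cross-validation estimate of prediction error under such a loss. The helper `fit_predict` is a hypothetical placeholder for any smoother (e.g., a local-likelihood fit); the paper's analytically approximated cross-validation formulas are not reproduced here.

```python
import numpy as np
from scipy.special import xlogy  # xlogy(x, y) = x*log(y), with 0 when x = 0


def bregman(y, mu, q, dq):
    """Bregman divergence generated by a concave function q with derivative dq:
    Q(y, mu) = q(mu) + (y - mu) * dq(mu) - q(y)."""
    return q(mu) + (y - mu) * dq(mu) - q(y)


# q(mu) = -mu^2 recovers squared-error loss: Q(y, mu) = (y - mu)^2.
q_sq, dq_sq = (lambda m: -m**2), (lambda m: -2.0 * m)

# q(mu) = -2[mu log mu + (1-mu) log(1-mu)] recovers the Bernoulli deviance.
q_dev = lambda m: -2.0 * (xlogy(m, m) + xlogy(1.0 - m, 1.0 - m))
dq_dev = lambda m: -2.0 * np.log(m / (1.0 - m))

print(bregman(1.0, 0.3, q_sq, dq_sq))    # 0.49  = (1 - 0.3)^2
print(bregman(1.0, 0.3, q_dev, dq_dev))  # 2.408 = -2 * log(0.3)


def loo_cv_error(x, y, fit_predict, q, dq):
    """Exact (un-approximated) leave-one-out cross-validation estimate of
    prediction error under the Bregman loss generated by q.

    fit_predict(x_train, y_train, x0) is a hypothetical smoother: it fits on
    the training data and returns the predicted mean at the point x0."""
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        mu_i = fit_predict(x[mask], y[mask], x[i])  # fit without obs i
        errs.append(bregman(y[i], mu_i, q, dq))
    return np.mean(errs)
```

The same `loo_cv_error` routine works for regression or classification simply by swapping the generating function q, which is the practical appeal of working within a single Bregman class.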
- Publication: arXiv Mathematics e-prints
- Pub Date: June 2005
- arXiv: arXiv:math/0506028
- Bibcode: 2005math......6028F
- Keywords: Mathematics - Statistics; 62G05, 62H30
- E-Print: 38 pages, 8 figures