Model selection by resampling penalization
Abstract
In this paper, a new family of resampling-based penalization procedures for model selection is defined in a general framework. It generalizes several methods, including Efron's bootstrap penalization and the leave-one-out penalization recently proposed by Arlot (2008), to any exchangeable weighted bootstrap resampling scheme. In the heteroscedastic regression framework, assuming the models to have a particular structure, these resampling penalties are proved to satisfy a non-asymptotic oracle inequality with leading constant close to 1. In particular, they are asympotically optimal. Resampling penalties are used for defining an estimator adapting simultaneously to the smoothness of the regression function and to the heteroscedasticity of the noise. This is remarkable because resampling penalties are general-purpose devices, which have not been built specifically to handle heteroscedastic data. Hence, resampling penalties naturally adapt to heteroscedasticity. A simulation study shows that resampling penalties improve on V-fold cross-validation in terms of final prediction error, in particular when the signal-to-noise ratio is not large.
- Publication:
-
arXiv e-prints
- Pub Date:
- June 2009
- DOI:
- 10.48550/arXiv.0906.3124
- arXiv:
- arXiv:0906.3124
- Bibcode:
- 2009arXiv0906.3124A
- Keywords:
-
- Mathematics - Statistics;
- 62G09;
- 62G08;
- 62M20
- E-Print:
- extended version of hal-00125455, with a technical appendix