Iterative feature selection in least square regression estimation
Abstract
In this paper, we focus on regression estimation in both the inductive and the transductive case. We assume that we are given a set of features (which can be a basis of functions, but not necessarily). We begin by giving a deviation inequality on the risk of an estimator in every model defined by a single feature. These models are too simple to be useful by themselves, but we then show how this result motivates an iterative algorithm that performs feature selection in order to build a suitable estimator. We prove that every selected feature actually improves the performance of the estimator. We give all the estimators and results first in the inductive case, which requires knowledge of the distribution of the design, and then in the transductive case, in which we do not need to know this distribution.
- Publication:
- Annales de L'Institut Henri Poincare Section (B) Probability and Statistics
- Pub Date:
- February 2008
- DOI:
- arXiv:
- arXiv:math/0511299
- Bibcode:
- 2008AIHPB..44...47A
- Keywords:
- Mathematics - Statistics Theory;
- 62G08 (Primary);
- 62G15;
- 68T05 (Secondary)
- E-Print:
- Annales de l'Institut Henri Poincare (B) Probability and Statistics 44, 1 (2008) p47-88