SRCEK: A Continuous Embedding of the Channel Selection Problem for weighted PLS Modeling
Abstract
SRCEK is a technique for selecting useful channels for affine modeling of a response by PLS. The technique embeds the discrete channel selection problem into the continuous space of predictor preweighting, then employs a quasi-Newton (or other) optimization algorithm to optimize the preweighting vector. Once the weighting vector has been optimized, the magnitudes of the weights indicate the relative importance of each channel. The relative importances are used to construct n different models, the kth consisting of the k most important channels. The different models are then compared by means of cross validation or an information criterion (e.g., BIC), allowing automatic selection of a "good" subset of the channels. The analytical Jacobian of the PLS regression vector with respect to the predictor weighting is derived to facilitate optimization of the latter. This formulation exploits the reduced rank of the predictor matrix to gain some speedup when the number of observations is fewer than the number of predictors (the usual case for, e.g., IR spectroscopy). The method compares favourably with the predictor selection techniques surveyed by Forina et al.
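The ranking and model-comparison steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: the quasi-Newton optimization of the preweighting vector is omitted and the weights are simply assumed to be given, the PLS fit is a plain NIPALS PLS1 routine, and the BIC used here is a simplified form that counts one parameter per selected channel plus an intercept. The names `pls1_fit` and `select_channels` are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Plain NIPALS PLS1; returns (coef, intercept) of the affine model."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                    # score vector
        tt = t @ t
        p = Xk.T @ t / tt             # X loading
        c = (yk @ t) / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate predictors
        yk = yk - c * t               # deflate response
        W.append(w); P.append(p); q.append(c)
    if not W:
        return np.zeros(X.shape[1]), y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def select_channels(X, y, weights, n_components=2):
    """Rank channels by |weight|, fit the n nested models, keep the lowest BIC."""
    m = X.shape[0]
    order = np.argsort(-np.abs(weights))       # most important channel first
    best_bic, best_idx = np.inf, order[:1]
    for k in range(1, len(order) + 1):
        idx = order[:k]                        # the k most important channels
        coef, b0 = pls1_fit(X[:, idx], y, min(n_components, k))
        resid = y - (X[:, idx] @ coef + b0)
        mse = max(float(resid @ resid) / m, np.finfo(float).tiny)
        # Simplified BIC (an assumption of this sketch, not the paper's criterion).
        bic = m * np.log(mse) + (k + 1) * np.log(m)
        if bic < best_bic:
            best_bic, best_idx = bic, idx
    return best_idx

# Synthetic demonstration: only channels 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + 0.05 * rng.normal(size=40)
weights = np.array([0.9, 0.01, 0.02, -1.2, 0.03, 0.01, 0.02, 0.01])  # assumed given
selected = select_channels(X, y, weights)
print(sorted(selected.tolist()))
```

Cross validation could replace the BIC comparison in `select_channels` without changing the rest of the sketch; the information criterion is used here only because it needs a single fit per candidate subset.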
- Publication: arXiv e-prints
- Pub Date: October 2013
- DOI: 10.48550/arXiv.1310.2557
- arXiv: arXiv:1310.2557
- Bibcode: 2013arXiv1310.2557P
- Keywords: Statistics - Applications; 62J05; G.3