Feature Selection with Annealing for Computer Vision and Big Data Learning
Abstract
Many computer vision and medical imaging problems require learning from large-scale datasets with millions of observations and features. In this paper we propose a novel, efficient learning scheme that tightens a sparsity constraint by gradually removing variables based on a criterion and a schedule. Because the problem size keeps dropping throughout the iterations, the scheme is particularly suitable for big data learning. Our approach applies generically to the optimization of any differentiable loss function, and finds applications in regression, classification and ranking. The resultant algorithms build variable screening into estimation and are extremely simple to implement. We provide theoretical guarantees of convergence and selection consistency. In addition, one-dimensional piecewise linear response functions are used to account for nonlinearity, and a second-order prior is imposed on these functions to avoid overfitting. Experiments on real and synthetic data show that the proposed method compares very well with other state-of-the-art methods in regression, classification and ranking, while being computationally very efficient and scalable.
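The idea of interleaving parameter updates with gradual variable removal can be illustrated with a minimal sketch. The abstract does not state the removal criterion or the schedule, so everything specific here is an assumption made for illustration: a squared loss, plain gradient descent, coefficient magnitude as the removal criterion, and a schedule that removes features quickly at first and reaches the target count `k` halfway through. The function name `annealed_selection` and the parameters `lr` and `mu` are hypothetical.

```python
import numpy as np

def annealed_selection(X, y, k, n_iter=200, lr=0.1, mu=10.0):
    """Fit least squares while gradually shrinking the feature set to k.

    Illustrative sketch only: the loss, criterion (|beta_j|), and schedule
    are assumptions, not details taken from the abstract.
    """
    n, p = X.shape
    keep = np.arange(p)        # indices of the surviving features
    beta = np.zeros(p)         # coefficients of the surviving features
    for e in range(1, n_iter + 1):
        Xa = X[:, keep]        # the working problem shrinks over time
        grad = Xa.T @ (Xa @ beta - y) / n
        beta = beta - lr * grad
        # Assumed schedule: number of features to keep after iteration e.
        frac = max(0.0, (n_iter - 2 * e) / (2 * e * mu + n_iter))
        m = k + int((p - k) * frac)
        if m < keep.size:
            # Keep the m coefficients of largest magnitude, in index order.
            top = np.sort(np.argsort(-np.abs(beta))[:m])
            keep, beta = keep[top], beta[top]
    return keep, beta
```

Each iteration works only on the columns still in `keep`, so the cost per step falls as features are dropped, which is the property the abstract highlights for large problems. Any other differentiable loss could be substituted by changing the gradient line.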
- Publication: arXiv e-prints
- Pub Date: October 2013
- DOI: 10.48550/arXiv.1310.2880
- arXiv: arXiv:1310.2880
- Bibcode: 2013arXiv1310.2880B
- Keywords: Statistics - Machine Learning; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning; Mathematics - Statistics Theory
- E-Print: 18 pages, 9 figures