A Scalable Empirical Bayes Approach to Variable Selection in Generalized Linear Models
Abstract
A new empirical Bayes approach to variable selection in the context of generalized linear models is developed. The proposed algorithm scales to situations in which the number of putative explanatory variables is very large, possibly much larger than the number of responses. The coefficients in the linear predictor are modeled as a three-component mixture, allowing each explanatory variable to have a random positive effect on the response, a random negative effect, or no effect. A key assumption is that only a small (but unknown) fraction of the candidate variables have a nonzero effect. This assumption, in addition to treating the coefficients as random effects, facilitates a computationally efficient approach: the number of parameters that must be estimated is small and remains constant regardless of the number of explanatory variables. The model parameters are estimated using a Generalized Alternating Maximization algorithm, which is scalable and leads to significantly faster convergence than simulation-based fully Bayesian methods.
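The three-component mixture described in the abstract can be illustrated with a minimal simulation sketch. This is not the paper's exact prior specification; the mixture weights, effect-size scale, and half-normal effect distribution below are hypothetical placeholders chosen only to show the structure: each coefficient is either exactly zero, a random positive effect, or a random negative effect, with most coefficients null.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sketch of a three-component mixture prior on the coefficients:
# a point mass at zero (no effect), a positive random effect, or a negative
# random effect. Weights and scale are illustrative, not from the paper.
p = 10_000                                  # number of candidate variables
w_zero, w_pos, w_neg = 0.98, 0.01, 0.01     # sparsity: most effects are null
scale = 1.0                                 # hypothetical effect-size scale

# Component label per coefficient: 0 = null, +1 = positive, -1 = negative.
components = rng.choice([0, 1, -1], size=p, p=[w_zero, w_pos, w_neg])

# Magnitudes drawn from a half-normal; the sign comes from the component.
beta = components * scale * np.abs(rng.standard_normal(p))

print(f"fraction of null coefficients: {np.mean(beta == 0):.3f}")
```

Under this sparsity assumption, only the small set of mixture parameters (the weights and the effect-size scale) needs to be estimated, regardless of how large `p` grows, which is what makes the empirical Bayes approach scale.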
Publication: arXiv e-prints
Pub Date: March 2018
DOI: 10.48550/arXiv.1803.09735
arXiv: arXiv:1803.09735
Bibcode: 2018arXiv180309735B
Keywords: Statistics - Methodology
E-Print: arXiv admin note: text overlap with arXiv:1510.03781