Fitting High-Dimensional Interaction Models with Error Control
Abstract
There is a renewed interest in polynomial regression in the form of identifying influential interactions between features. In many settings, this takes place in a high-dimensional model, making the number of interactions unwieldy or computationally infeasible. Furthermore, it is difficult to analyze such spaces directly as they are often highly correlated. Standard feature selection issues remain such as how to determine a final model which generalizes well. This paper solves these problems with a sequential algorithm called Revisiting Alpha-Investing (RAI). RAI is motivated by the principle of marginality and searches the feature-space of higher-order interactions by greedily building upon lower-order terms. RAI controls a notion of false rejections and comes with a performance guarantee relative to the best-subset model. This ensures that signal is identified while providing a valid stopping criterion to prevent over-selection. We apply RAI in a novel setting over a family of regressions in order to select gene-specific interaction models for differential expression profiling.
- Publication:
-
arXiv e-prints
- Pub Date:
- October 2015
- DOI:
- 10.48550/arXiv.1510.06322
- arXiv:
- arXiv:1510.06322
- Bibcode:
- 2015arXiv151006322J
- Keywords:
-
- Statistics - Methodology