Identifying important predictors in large data bases -- multiple testing and model selection
Abstract
This is a chapter of the forthcoming Handbook of Multiple Testing. We consider a variety of model selection strategies in a high-dimensional setting, where the number of potential predictors p is large compared to the number of available observations n. Modifications of information criteria that are suitable when p > n are introduced and compared with a variety of penalized likelihood methods, in particular SLOPE and SLOBE. The focus is on methods that control the false discovery rate (FDR) in terms of model identification. Theoretical results are provided with respect to both model identification and prediction, and simulation results illustrate the performance of the different methods in different situations.
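As a rough illustration of how a penalized method such as SLOPE can be tied to FDR control, the sketch below computes a decreasing penalty sequence modeled on the Benjamini-Hochberg critical values, lambda_i = sigma * Phi^{-1}(1 - i*q/(2p)). This choice of sequence is an assumption taken from the general SLOPE literature; the abstract does not state which sequence the chapter uses. The function name `bh_lambda_sequence` and the parameters `q` (target FDR level) and `sigma` (noise standard deviation) are hypothetical names introduced only for this example.

```python
from statistics import NormalDist


def bh_lambda_sequence(p, q=0.1, sigma=1.0):
    """Return a decreasing BH-type penalty sequence of length p.

    Assumption: lambda_i = sigma * Phi^{-1}(1 - i*q / (2*p)), i = 1..p,
    where Phi^{-1} is the standard normal quantile function.
    """
    if p < 1:
        raise ValueError("p must be a positive integer")
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        sigma * standard_normal.inv_cdf(1.0 - i * q / (2.0 * p))
        for i in range(1, p + 1)
    ]


# Example: five candidate predictors, nominal FDR level 0.1.
lambdas = bh_lambda_sequence(p=5, q=0.1)
```

The largest penalty is applied to the largest coefficient in absolute value and the smallest penalty to the smallest, which is what distinguishes a sorted penalty of this kind from a single-parameter LASSO penalty.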
- Publication: arXiv e-prints
- Pub Date: November 2020
- arXiv: arXiv:2011.12154
- Bibcode: 2020arXiv201112154B
- Keywords: Statistics - Methodology