A stable and adaptive polygenic signal detection method based on repeated sample splitting
Abstract
Focusing on polygenic signal detection in high-dimensional genetic association studies of complex traits, we develop an adaptive test for generalized linear models to accommodate different alternatives. To facilitate valid post-selection inference for high-dimensional data, our approach adheres to the original sample-splitting principle but applies it repeatedly to increase the stability of the inference. We derive the asymptotic null distributions of the proposed test for both a fixed and a diverging number of variants. We also establish the asymptotic properties of the proposed test under local alternatives, providing insight into why the power gain attributed to variable selection and weighting can compensate for the efficiency loss due to sample splitting. We support our analytical findings through extensive simulation studies and two applications. The proposed procedure is computationally efficient and has been implemented as the R package DoubleCauchy.
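To make the repeated sample-splitting idea concrete, the following is a minimal Python sketch and not the authors' procedure or the DoubleCauchy API. It rests on several assumptions that the abstract does not state: a continuous trait with a simple correlation-based test instead of a generalized linear model, selection of the `top_k` variants by marginal score, and an equal-weight Cauchy combination of the per-split p-values (suggested only by the package name). All function names and parameters (`cauchy_combine`, `split_pvalue`, `repeated_split_test`, `top_k`, `n_splits`) are hypothetical.

```python
import math
import random
from statistics import NormalDist


def cauchy_combine(pvalues):
    """Equal-weight Cauchy combination of p-values (assumed combination rule)."""
    k = len(pvalues)
    # Under the null, tan((0.5 - p) * pi) is a standard Cauchy variate.
    t = sum(math.tan((0.5 - p) * math.pi) for p in pvalues) / k
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(t) / math.pi


def split_pvalue(x, y, rng, top_k=5):
    """One split: select and weight variants on one half, test on the other."""
    n, p = len(y), len(x[0])
    idx = list(range(n))
    rng.shuffle(idx)
    sel, tst = idx[: n // 2], idx[n // 2:]

    # Selection half: marginal score of each variant against the centered trait.
    ybar = sum(y[i] for i in sel) / len(sel)
    scores = [sum(x[i][j] * (y[i] - ybar) for i in sel) for j in range(p)]
    keep = sorted(range(p), key=lambda j: abs(scores[j]), reverse=True)[:top_k]

    # Testing half: burden score weighted by the selection-half scores.
    b = [sum(scores[j] * x[i][j] for j in keep) for i in tst]
    yt = [y[i] for i in tst]
    m = len(tst)
    bm, ym = sum(b) / m, sum(yt) / m
    sbb = sum((v - bm) ** 2 for v in b)
    syy = sum((v - ym) ** 2 for v in yt)
    if sbb == 0 or syy == 0:
        return 0.5  # degenerate split: treat as uninformative
    r = sum((u - bm) * (v - ym) for u, v in zip(b, yt)) / math.sqrt(sbb * syy)
    # One-sided test: the direction was learned on the independent half.
    return 1.0 - NormalDist().cdf(r * math.sqrt(m))


def repeated_split_test(x, y, n_splits=20, top_k=5, seed=0):
    """Repeat the random split and combine the resulting p-values."""
    rng = random.Random(seed)
    pvals = [split_pvalue(x, y, rng, top_k) for _ in range(n_splits)]
    return cauchy_combine(pvals)
```

A call such as `repeated_split_test(x, y, n_splits=20)` takes `x` as a list of per-individual genotype rows and `y` as the list of trait values. Because the variants are selected and weighted on data independent of the testing half, each per-split p-value is computed without reusing the data that chose the variants; repeating the split reduces the dependence of the final result on any single random partition.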
- Publication: arXiv e-prints
- Pub Date: August 2020
- DOI: 10.48550/arXiv.2008.02442
- arXiv: arXiv:2008.02442
- Bibcode: 2020arXiv200802442Z
- Keywords: Statistics - Methodology
- E-Print: 24 pages