Genome-Wide Association Studies (GWAS) are typically conducted using linear models to find genetic variants associated with common diseases. In these studies, association testing is done on a variant-by-variant basis, which can miss non-linear interaction effects between variants. Deep networks can model these interactions, but they are difficult to train and interpret on large genetic datasets. We propose a method that uses DeepLIFT, a gradient-based deep interpretability technique, to show that deep models can identify known genetic risk factors for diabetes, along with possibly novel associations.
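To make the attribution idea concrete, the following is a minimal sketch of DeepLIFT's Rescale rule for a tiny one-hidden-layer ReLU network over genotype dosages. It is not the authors' implementation: the network shape, the random weights, the example genotype vector, the all-zeros reference input, and the names `predict` and `deeplift_rescale` are all assumptions chosen for illustration.

```python
import numpy as np


def predict(x, W1, b1, w2, b2):
    """Toy risk score: y = w2 . relu(W1 @ x + b1) + b2."""
    return w2 @ np.maximum(W1 @ x + b1, 0.0) + b2


def deeplift_rescale(x, x_ref, W1, b1, w2):
    """Per-variant contributions to predict(x) - predict(x_ref) (Rescale rule)."""
    dx = x - x_ref
    z, z_ref = W1 @ x + b1, W1 @ x_ref + b1
    dz = z - z_ref
    dh = np.maximum(z, 0.0) - np.maximum(z_ref, 0.0)
    # Multiplier of each hidden unit: delta-output / delta-input.
    # Where delta-input is ~0, fall back to the ordinary ReLU gradient.
    small = np.abs(dz) < 1e-7
    safe_dz = np.where(small, 1.0, dz)
    m = np.where(small, (z > 0).astype(float), dh / safe_dz)
    # Chain the multipliers back through the linear layers to the inputs.
    return dx * (W1.T @ (m * w2))


# Hypothetical toy setup: 6 variants, 4 hidden units, random weights.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 6))
b1 = rng.normal(size=4)
w2 = rng.normal(size=4)
b2 = 0.1

x = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])  # assumed genotype dosages (0/1/2)
x_ref = np.zeros(6)                            # assumed reference: no minor alleles

contrib = deeplift_rescale(x, x_ref, W1, b1, w2)
delta_y = predict(x, W1, b1, w2, b2) - predict(x_ref, W1, b1, w2, b2)
print("per-variant contributions:", contrib)
print("sum of contributions:", contrib.sum(), "delta output:", delta_y)
```

The contributions sum to the difference between the model output on the sample and on the reference, which is the property that lets per-variant scores be ranked and compared against known risk loci. In practice an established DeepLIFT implementation would be used rather than a hand-rolled one, and the choice of reference genotype would need to be justified for the dataset at hand.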
- Pub Date: July 2020
- Keywords: Computer Science - Machine Learning; Quantitative Biology - Genomics; Statistics - Applications; Statistics - Machine Learning
- Accepted at ICML 2020 workshop on ML Interpretability for Scientific Discovery