Federated causal inference for out-of-distribution generalization in predicting physiological effects of radiation exposure
Abstract
Authors: Paul Duckworth*, Odhran O'Donoghue*, Linus Scheibenreif*, Giuseppe Ughi*, Kia Khezeli, Adrienne Hoarfrost, Samuel Budd, Nicholas Chia, Patrick Foley, Graham Mackintosh, John Kalantari, Frank Soboczenski and Lauren Sanders
*Equal contributions from first four authors.

Abstract: The physiological effects of radiation exposure are a health risk for a variety of groups, including cancer patients undergoing radiotherapy and astronauts in space. Understanding genetic risk factors is key to preventing and mitigating the adverse effects of radiation exposure. To enhance our understanding of such genetic risk factors, researchers must overcome several outstanding data challenges, in particular data imbalance and data access. The amount of relevant model organism (e.g. mouse) data far outweighs the amount of human data, and human medical data can be difficult to obtain due to security restrictions or bandwidth limitations. To address these issues, we leveraged and expanded upon the Causal Relation and Inference Search Platform (CRISP), an ensemble causal learning platform for identifying candidate biomarkers of disease progression in heterogeneous multi-omics data. To assess the applicability of CRISP to data from different frameworks and distributions, we developed a set of synthetic datasets with multi-class targets and Bernoulli random variables, which include hidden confounders and environmental behaviours. Furthermore, we formally evaluated the efficacy of new methods in the ensemble, as well as the efficacy of dimensionality reduction with observations coming from different frameworks. To allow CRISP to run on datasets in remote or firewalled locations, we implemented federated causal inference through Intel's OpenFL project. Finally, we evaluated the compatibility of mouse and human radiation exposure transcriptomic data and statistically assessed the amount of data needed to approximate human-only results. We present an early view of the pipeline with federated learning capabilities for causal inference of genetic biomarkers of radiation-induced carcinogenesis.
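
As an illustration of the kind of synthetic benchmark the abstract describes (Bernoulli features, a multi-class target, a hidden confounder, and several environments), the following is a minimal sketch and not the authors' actual generator. The function name `make_environment`, the probabilities, and the three-class construction are all assumptions chosen for the example. The idea is that one feature stays causally tied to the target in every environment, while a second feature agrees with the target at a rate that changes between environments, so a model that relies on it fails out of distribution.

```python
import numpy as np

def make_environment(n, agreement, rng):
    """Hypothetical generator for one environment (all parameters are illustrative)."""
    # Hidden confounder: drives both the causal feature and the target,
    # but is never exposed as a column of X.
    h = rng.binomial(1, 0.5, size=n)
    # Causal Bernoulli feature whose success probability depends on h.
    x_causal = rng.binomial(1, 0.3 + 0.4 * h)
    # Multi-class target in {0, 1, 2}, built from the causal feature,
    # the confounder, and a small amount of Bernoulli noise.
    noise = rng.binomial(1, 0.1, size=n)
    y = np.minimum(x_causal + h + noise, 2)
    # Spurious Bernoulli feature: matches the indicator (y > 0) with a
    # probability that differs from one environment to the next.
    target_bit = (y > 0).astype(int)
    agree = rng.binomial(1, agreement, size=n)
    x_spurious = np.where(agree == 1, target_bit, 1 - target_bit)
    X = np.column_stack([x_causal, x_spurious])
    return X, y

rng = np.random.default_rng(0)
# Two training environments where the spurious feature looks useful,
# and one test environment where its relationship to the target flips.
envs = {
    name: make_environment(2000, agreement, rng)
    for name, agreement in [("train_a", 0.9), ("train_b", 0.8), ("test", 0.1)]
}
```

A causal feature-selection method evaluated on such data would be expected to keep the first column and discard the second, since only the first has a stable relationship with the target across environments.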
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2021
- Bibcode: 2021AGUFMIN12A..04S