Optimizing Data-driven Causal Discovery Using Knowledge-guided Search
Abstract
Learning causal relationships solely from observational data often fails to reveal the underlying causal mechanisms due to the vast search space of possible causal graphs, which can grow exponentially, especially for greedy algorithms using score-based approaches. Leveraging prior causal information, such as the presence or absence of causal edges, can help restrict and guide the score-based discovery process, leading to a more accurate search. In the healthcare domain, prior knowledge is abundant from sources like medical journals, electronic health records (EHRs), and clinical intervention outcomes. This study introduces a knowledge-guided causal structure search (KGS) approach that utilizes observational data and structural priors (such as causal edges) as constraints to learn the causal graph. KGS leverages prior edge information between variables, including the presence of a directed edge, the absence of an edge, and the presence of an undirected edge. We extensively evaluate KGS in multiple settings using synthetic and benchmark real-world datasets, as well as in a real-life healthcare application related to oxygen therapy treatment. To obtain causal priors, we use GPT-4 to retrieve relevant literature information. Our results show that structural priors of any type and amount enhance the search process, improving performance and optimizing causal discovery. This guided strategy ensures that the discovered edges align with established causal knowledge, enhancing the trustworthiness of findings while expediting the search process. It also enables a more focused exploration of causal mechanisms, potentially leading to more effective and personalized healthcare solutions.
- Publication:
-
arXiv e-prints
- Pub Date:
- April 2023
- DOI:
- arXiv:
- arXiv:2304.05493
- Bibcode:
- 2023arXiv230405493H
- Keywords:
-
- Computer Science - Artificial Intelligence