A Diagnostic Tool for Functional Causal Discovery
Abstract
Causal discovery methods aim to determine the causal direction between variables using observational data. Functional causal discovery methods, such as those based on the Linear Non-Gaussian Acyclic Model (LiNGAM), rely on structural and distributional assumptions to infer the causal direction. However, approaches for assessing how the performance of causal discovery methods depends on sample size, or on the assumption violations that are inevitable in real-world scenarios, are lacking. To address this need, we propose the Causal Direction Detection Rate (CDDR) diagnostic, which evaluates whether and to what extent the interaction between assumption violations and sample size affects the ability to identify the hypothesized causal direction. Given a bivariate dataset of size N on a pair of variables, X and Y, the CDDR diagnostic is a plot comparing the probability of each causal discovery outcome (e.g., X causes Y, Y causes X, or inconclusive) as a function of sample sizes smaller than N. We fully develop the CDDR diagnostic in the bivariate case and demonstrate its use for two methods: LiNGAM and our new test-based causal discovery approach. We find the CDDR diagnostic for the test-based approach to be more informative because it uses a richer set of causal discovery outcomes. Under certain assumptions, we prove that the estimated probabilities of each possible causal discovery outcome are consistent and asymptotically normal. Through simulations, we study the behavior of the CDDR diagnostic when the linearity and non-Gaussianity assumptions are violated. Additionally, we illustrate the CDDR diagnostic on four real datasets, including three for which the causal direction is known.
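As a rough illustration of the idea described in the abstract, the sketch below estimates outcome rates at several sample sizes smaller than N by repeatedly subsampling a bivariate dataset and applying a direction rule to each subsample. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `decide_direction` and `cddr_curve`, the `margin` threshold, the use of subsampling without replacement, and the decision rule itself (comparing how strongly squared residuals correlate with the squared regressor in each direction) are hypothetical stand-ins. The rule is neither LiNGAM nor the paper's test-based approach.

```python
import numpy as np


def decide_direction(x, y, margin=0.05):
    """Hypothetical stand-in rule, NOT the paper's method or LiNGAM.

    Fits a linear regression in each direction and prefers the direction
    whose residuals look less dependent on the regressor.
    """
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    rho = np.mean(x * y)          # correlation of the standardized variables
    res_y = y - rho * x           # residual from regressing y on x
    res_x = x - rho * y           # residual from regressing x on y
    # Crude dependence measure: |correlation| between squared quantities.
    dep_xy = abs(np.corrcoef(x**2, res_y**2)[0, 1])
    dep_yx = abs(np.corrcoef(y**2, res_x**2)[0, 1])
    if dep_yx - dep_xy > margin:
        return "X->Y"
    if dep_xy - dep_yx > margin:
        return "Y->X"
    return "inconclusive"


def cddr_curve(x, y, sizes, n_rep=200, seed=0):
    """Estimate the rate of each outcome at each subsample size n < N."""
    rng = np.random.default_rng(seed)
    outcomes = ("X->Y", "Y->X", "inconclusive")
    curve = {}
    for n in sizes:
        counts = dict.fromkeys(outcomes, 0)
        for _ in range(n_rep):
            # Assumption: subsamples are drawn without replacement.
            idx = rng.choice(len(x), size=n, replace=False)
            counts[decide_direction(x[idx], y[idx])] += 1
        curve[n] = {k: v / n_rep for k, v in counts.items()}
    return curve


# Simulated example: linear model X -> Y with uniform (non-Gaussian) noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 2000)
y = x + rng.uniform(-1, 1, 2000)
curve = cddr_curve(x, y, sizes=[50, 200, 1000])
for n, rates in curve.items():
    print(n, rates)
```

Plotting the three rates in `curve` against the subsample size would give a diagnostic plot of the kind the abstract describes. Any other decision procedure that returns one of the same three outcome labels could be substituted for `decide_direction`.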
 Publication:

arXiv e-prints
 Pub Date:
 June 2024
 DOI:
 10.48550/arXiv.2406.07787
 arXiv:
 arXiv:2406.07787
 Bibcode:
 2024arXiv240607787P
 Keywords:

 Statistics - Methodology;
 Statistics - Applications