On the Identifiability and Estimation of Causal Location-Scale Noise Models
Abstract
We study the class of location-scale or heteroscedastic noise models (LSNMs), in which the effect $Y$ can be written as a function of the cause $X$ and a noise source $N$ independent of $X$, which may be scaled by a positive function $g$ over the cause, i.e., $Y = f(X) + g(X)N$. Despite the generality of the model class, we show the causal direction is identifiable up to some pathological cases. To empirically validate these theoretical findings, we propose two estimators for LSNMs: an estimator based on (nonlinear) feature maps, and one based on neural networks. Both model the conditional distribution of $Y$ given $X$ as a Gaussian parameterized by its natural parameters. When the feature maps are correctly specified, we prove that our estimator is jointly concave, and a consistent estimator for the cause-effect identification task. Although the neural network does not inherit those guarantees, it can fit functions of arbitrary complexity, and reaches state-of-the-art performance across benchmarks.
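The model $Y = f(X) + g(X)N$ and the natural-parameter form of a Gaussian can be illustrated with a minimal sketch. It is not the authors' estimator: the functions `f` and `g`, the sampling range, and all helper names are hypothetical choices made for illustration, and the noise is assumed to be standard normal. The only fact relied on is the standard natural parameterization of a Gaussian, $\eta_1 = \mu/\sigma^2$ and $\eta_2 = -1/(2\sigma^2)$, here with $\mu = f(x)$ and $\sigma = g(x)$.

```python
import math
import random


def to_natural(mu, sigma2):
    """Map mean and variance to Gaussian natural parameters (eta1, eta2)."""
    return mu / sigma2, -1.0 / (2.0 * sigma2)


def log_density_natural(y, eta1, eta2):
    """Gaussian log-density in natural parameters; requires eta2 < 0."""
    return (eta1 * y + eta2 * y * y
            + eta1 * eta1 / (4.0 * eta2)
            + 0.5 * math.log(-2.0 * eta2)
            - 0.5 * math.log(2.0 * math.pi))


def log_density_standard(y, mu, sigma2):
    """The same log-density in the usual mean/variance form, for comparison."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (y - mu) ** 2 / (2.0 * sigma2)


def f(x):
    """Hypothetical location function (an assumption, not from the paper)."""
    return math.sin(x)


def g(x):
    """Hypothetical scale function; strictly positive as the model requires."""
    return 0.5 + 0.25 * x * x


def sample_lsnm(n, seed=0):
    """Draw n pairs (x, y) with y = f(x) + g(x) * noise, noise ~ N(0, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        noise = rng.gauss(0.0, 1.0)  # drawn independently of x
        data.append((x, f(x) + g(x) * noise))
    return data


def avg_log_lik(data):
    """Average conditional log-likelihood of y given x under the true f and g."""
    total = 0.0
    for x, y in data:
        eta1, eta2 = to_natural(f(x), g(x) ** 2)
        total += log_density_natural(y, eta1, eta2)
    return total / len(data)
```

Calling `avg_log_lik(sample_lsnm(500))` scores the data in the causal direction under the true functions. A cause-effect estimator in this spirit would instead fit the natural parameters from data in both directions and compare the resulting likelihoods; that fitting step is not shown here.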
 Publication:

arXiv e-prints
 Pub Date:
 October 2022
 DOI:
 10.48550/arXiv.2210.09054
 arXiv:
 arXiv:2210.09054
 Bibcode:
 2022arXiv221009054I
 Keywords:

 Statistics - Machine Learning;
 Computer Science - Artificial Intelligence;
 Computer Science - Machine Learning
 E-Print:
 ICML 2023