Efficient Distance Approximation for Structured HighDimensional Distributions via Learning
Abstract
We design efficient distance approximation algorithms for several classes of structured highdimensional distributions. Specifically, we show algorithms for the following problems:  Given sample access to two Bayesian networks $P_1$ and $P_2$ over known directed acyclic graphs $G_1$ and $G_2$ having $n$ nodes and bounded indegree, approximate $d_{tv}(P_1,P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time  Given sample access to two ferromagnetic Ising models $P_1$ and $P_2$ on $n$ variables with bounded width, approximate $d_{tv}(P_1, P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time  Given sample access to two $n$dimensional Gaussians $P_1$ and $P_2$, approximate $d_{tv}(P_1, P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time  Given access to observations from two causal models $P$ and $Q$ on $n$ variables that are defined over known causal graphs, approximate $d_{tv}(P_a, Q_a)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples, where $P_a$ and $Q_a$ are the interventional distributions obtained by the intervention $do(A=a)$ on $P$ and $Q$ respectively for a particular variable $A$. Our results are the first efficient distance approximation algorithms for these wellstudied problems. They are derived using a simple and general connection to distribution learning algorithms. The distance approximation algorithms imply new efficient algorithms for {\em tolerant} testing of closeness of the abovementioned structured highdimensional distributions.
 Publication:

arXiv eprints
 Pub Date:
 February 2020
 arXiv:
 arXiv:2002.05378
 Bibcode:
 2020arXiv200205378B
 Keywords:

 Computer Science  Data Structures and Algorithms;
 Computer Science  Machine Learning
 EPrint:
 24 pages, 1 figure