TATTER: A hypothesis testing tool for multidimensional data
Abstract
The two-sample hypothesis test quantifies whether two distributions p and q differ, given a finite sample drawn from each. This problem appears in many applications in astronomy, ranging from data mining to data analysis and inference. For decades, the Kolmogorov-Smirnov test has been astronomers' first choice for answering this question, but it has a major drawback: its generalization to multidimensional data sets is not straightforward. To fill this gap, we present a nonparametric method for comparing two multidimensional distributions, given finite samples drawn from them. This method employs a kernel function to construct an unbiased estimator of the Maximum Mean Discrepancy (MMD) distance between the two distributions that generated the observed data. We perform controlled numerical experiments in Gaussian, non-Gaussian, and multidimensional finite-sample settings and test the performance of the MMD estimator in each experiment. We then discuss some applications of this method in astronomical data analysis.
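As an illustration of the kernel-based estimator described in the abstract, the following is a minimal sketch of an unbiased estimate of the squared MMD between two multidimensional samples. It is not the TATTER interface: the function names (gaussian_kernel, mmd2_unbiased, permutation_p_value), the choice of a Gaussian kernel, the fixed bandwidth sigma, and the permutation procedure used to obtain a p-value are all assumptions made for this example, and the abstract does not specify any of them.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased estimate of the squared MMD between samples x (m, d) and y (n, d)."""
    m, n = len(x), len(y)
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    # Within-sample terms exclude the diagonal (i == j), which is what
    # removes the bias of the plug-in estimate.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()


def permutation_p_value(x, y, sigma=1.0, n_permutations=200, seed=0):
    """Assumed significance procedure: shuffle the pooled sample to build a null."""
    rng = np.random.default_rng(seed)
    observed = mmd2_unbiased(x, y, sigma)
    pooled = np.vstack([x, y])
    m = len(x)
    n_exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        stat = mmd2_unbiased(pooled[perm[:m]], pooled[perm[m:]], sigma)
        n_exceed += stat >= observed
    # Add-one correction keeps the p-value strictly positive.
    return observed, (n_exceed + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(200, 3))  # sample from p
    y = rng.normal(0.5, 1.0, size=(200, 3))  # sample from q (shifted mean)
    stat, p_value = permutation_p_value(x, y)
    print(f"MMD^2 estimate = {stat:.4f}, permutation p-value = {p_value:.4f}")
```

Because the estimator is unbiased, it can take small negative values when the two samples come from the same distribution. In practice the bandwidth sigma would need to be matched to the scale of the data; a common heuristic, not taken from the abstract, is the median pairwise distance of the pooled sample.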
 Publication:

Astronomy and Computing
 Pub Date:
 January 2021
 DOI:
 10.1016/j.ascom.2020.100445
 Bibcode:
 2021A&C....3400445F
 Keywords:
 Methods: data analysis;
 Methods: statistical