Benford's law in the Gaia universe
Abstract
Context: Benford's law states that for scale- and base-invariant data sets covering a wide dynamic range, the distribution of the first significant digit is biased towards low values. This has been shown to be true for wildly different data sets, including financial, geographical, and atomic data. In astronomy, earlier work showed that Benford's law also holds for distances estimated as the inverse of parallaxes from the ESA HIPPARCOS mission.
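For reference, the standard form of Benford's law (not spelled out in the abstract itself) gives the probability that the first significant digit equals d as:

```latex
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, \dots, 9
```

Under this law a leading 1 is expected in about 30.1% of cases and a leading 9 in about 4.6%.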
Aims: We investigate whether Benford's law still holds for the 1.3 billion parallaxes contained in the second data release of Gaia (Gaia DR2). In contrast to previous work, we also include negative parallaxes. We examine whether distance estimates computed using a Bayesian approach instead of parallax inversion still follow Benford's law. Lastly, we investigate the use of Benford's law as a validation tool for the zero-point of the Gaia parallaxes.
Methods: We computed histograms of the observed most significant digit of the parallaxes and distances, and compared them with the predicted values from Benford's law, as well as with theoretically expected histograms. The latter were derived from a simulated Gaia catalogue based on the Besançon galaxy model.
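The comparison described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names (`first_significant_digit`, `benford_expected`, `digit_frequencies`) are hypothetical, and the input is a synthetic log-uniform sample with random signs rather than Gaia DR2 data. Negative values are handled by taking the absolute value, which is an assumption about how signed parallaxes would be treated.

```python
import numpy as np

def first_significant_digit(values):
    """Most significant digit (1-9) of each finite, non-zero value; the sign is ignored."""
    x = np.abs(np.asarray(values, dtype=float))
    x = x[np.isfinite(x) & (x > 0)]  # zeros and NaNs have no leading digit
    # Read the digit off the shortest round-trip scientific representation,
    # which avoids rounding problems near powers of ten.
    return np.array([int(np.format_float_scientific(v)[0]) for v in x], dtype=int)

def benford_expected():
    """Benford probabilities log10(1 + 1/d) for d = 1..9."""
    d = np.arange(1, 10)
    return np.log10(1.0 + 1.0 / d)

def digit_frequencies(values):
    """Observed relative frequency of each leading digit 1..9."""
    digits = first_significant_digit(values)
    counts = np.bincount(digits, minlength=10)[1:10]
    return counts / counts.sum()

# Synthetic stand-in for a catalogue column (assumption: log-uniform over six
# decades with random signs; this is NOT real parallax data).
rng = np.random.default_rng(42)
sample = 10.0 ** rng.uniform(-3.0, 3.0, size=20000)
sample *= rng.choice([-1.0, 1.0], size=sample.size)

observed = digit_frequencies(sample)
for d, (obs, exp) in enumerate(zip(observed, benford_expected()), start=1):
    print(f"digit {d}: observed {obs:.3f}, Benford {exp:.3f}")
```

The string-based digit extraction favours correctness over speed; for a catalogue of a billion rows a vectorised approach with explicit handling of values near powers of ten would likely be needed.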
Results: The observed parallaxes in Gaia DR2 indeed follow Benford's law. Distances computed with the Bayesian approach of Bailer-Jones et al. (2018, AJ, 156, 58) no longer follow Benford's law, although low-value ciphers are still favoured for the most significant digit. The prior that is used has a significant effect on the digit distribution. Using the simulated Gaia universe model snapshot, we demonstrate that the true distances underlying the Gaia catalogue are not expected to follow Benford's law, essentially because the interplay between the luminosity function of the Milky Way and the mission selection function results in a bimodal distance distribution, corresponding to nearby dwarfs in the Galactic disc and distant giants in the Galactic bulge. In conclusion, Gaia DR2 parallaxes only follow Benford's law as a result of observational errors. Finally, we show that a zero-point offset of the parallaxes derived by optimising the fit between the observed most-significant-digit frequencies and Benford's law leads to a value that is inconsistent with the value that is derived from quasars. The underlying reason is that such a fit primarily corrects for the difference in the number of positive and negative parallaxes, and thus cannot be used to obtain a reliable zero-point.
 Publication:

Astronomy and Astrophysics
 Pub Date:
 October 2020
 DOI:
 10.1051/0004-6361/201937256
 arXiv:
 arXiv:2008.12271
 Bibcode:
 2020A&A...642A.205D
 Keywords:

 astronomical databases: miscellaneous;
 astrometry;
 stars: distances;
 Astrophysics - Astrophysics of Galaxies
 EPrint:
 A&