The Emptiness Inside: Finding Gaps, Valleys, and Lacunae with Geometric Data Analysis
Abstract
Discoveries of gaps in data have been important in astrophysics. For example, there are kinematic gaps opened by resonances in dynamical systems, or exoplanets of a certain radius that are empirically rare. A gap in a data set is a kind of anomaly, but in an unusual sense: instead of being a single outlier data point, situated far from other data points, it is a region of the space, or a set of points, that is anomalous compared to its surroundings. Gaps are both interesting and hard to find and characterize, especially when they have nontrivial shapes. We present in this paper a statistic that can be used to estimate the (local) "gappiness" of a point in the data space. It uses the gradient and Hessian of the density estimate (and thus requires a twice-differentiable density estimator). This statistic can be computed at (almost) any point in the space and does not rely on optimization; it allows us to highlight underdense regions of any dimensionality and shape in a general and efficient way. We illustrate our method on the velocity distribution of nearby stars in the Milky Way disk plane, which exhibits gaps that could originate from different processes. Identifying and characterizing those gaps could help determine their origins. We provide in an appendix implementation notes and additional considerations for finding underdensities in data, using critical points and the properties of the Hessian of the density. A Python implementation of the methods presented here is available at https://github.com/contardog/FindTheGap.
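To make the ingredients named in the abstract concrete, the following is a minimal sketch of how the gradient and Hessian of a density estimate can be turned into a local curvature measure. It is not the paper's implementation and does not use the FindTheGap API. It assumes a Gaussian kernel with a single bandwidth `h` (the e-print notes mention a quadratic kernel, which is swapped out here for simplicity). It also assumes that a useful "gappiness" proxy is the largest eigenvalue of the Hessian restricted to the directions orthogonal to the gradient. The function names `kde_grad_hess` and `orthogonal_curvature` and the toy data are hypothetical.

```python
import numpy as np


def kde_grad_hess(x, data, h):
    """Gaussian KDE value, gradient, and Hessian at a single point x.

    Assumption: isotropic Gaussian kernel with bandwidth h (illustrative choice).
    """
    n, d = data.shape
    u = (x - data) / h                              # scaled offsets, shape (n, d)
    w = np.exp(-0.5 * np.sum(u**2, axis=1))         # unnormalized kernel weights
    norm = n * (2.0 * np.pi) ** (d / 2) * h**d
    density = w.sum() / norm
    grad = -(w[:, None] * u).sum(axis=0) / (norm * h)
    outer = np.einsum("n,ni,nj->ij", w, u, u)       # sum_i w_i u_i u_i^T
    hess = (outer - w.sum() * np.eye(d)) / (norm * h**2)
    return density, grad, hess


def orthogonal_curvature(grad, hess):
    """Largest Hessian eigenvalue in the subspace orthogonal to the gradient.

    Positive values mean the density curves upward across the gradient
    direction, as expected inside a valley. If the gradient is exactly zero,
    the excluded direction is arbitrary.
    """
    # Rows 1..d-1 of vt form an orthonormal basis orthogonal to grad.
    _, _, vt = np.linalg.svd(grad[None, :])
    basis = vt[1:]
    reduced = basis @ hess @ basis.T                # (d-1) x (d-1) symmetric matrix
    return np.linalg.eigvalsh(reduced).max()


# Toy data (hypothetical): two parallel ridges at x = -1 and x = +1.
ys = np.linspace(-3.0, 3.0, 61)
data = np.concatenate([
    np.column_stack([np.full_like(ys, -1.0), ys]),
    np.column_stack([np.full_like(ys, 1.0), ys]),
])
h = 0.4

_, g_gap, H_gap = kde_grad_hess(np.array([0.0, 2.5]), data, h)
_, g_ridge, H_ridge = kde_grad_hess(np.array([1.0, 2.5]), data, h)
gap_stat = orthogonal_curvature(g_gap, H_gap)       # point between the ridges
ridge_stat = orthogonal_curvature(g_ridge, H_ridge) # point on a ridge
print(gap_stat > 0, ridge_stat < 0)
```

In this toy setup the point between the two ridges gets a positive value and the point on a ridge gets a negative one, which is the qualitative behavior one would want from a gap indicator. Evaluating the same quantity on a grid of points would give a map of candidate underdense regions without any optimization step.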
- Publication: The Astronomical Journal
- Pub Date: November 2022
- DOI: 10.3847/1538-3881/ac961e
- arXiv: arXiv:2201.10674
- Bibcode: 2022AJ....164..226C
- Keywords: Astronomy data analysis; Computational astronomy; Astrostatistics techniques; Milky Way dynamics; 1858; 293; 1886; 1051; Astrophysics - Instrumentation and Methods for Astrophysics; Astrophysics - Astrophysics of Galaxies
- E-Print: 17 pages, 10 figures. Submitted to AJ. Comments welcomed. Revision: added 3D gridding + restructured outline: implementation notes (Quadratic Kernel) and methods for approx critical points and 1d-valley now in Annex