The Importance of Being Interpretable: Toward An Understandable Machine Learning Encoder for Galaxy Cluster Cosmology
We present a deep machine learning (ML) approach to constraining cosmological parameters with multi-wavelength observations of galaxy clusters. The ML approach has two components: an encoder that builds a compressed representation of each galaxy cluster and a flexible CNN to estimate the cosmological model from a cluster sample. It is trained and tested on simulated cluster catalogs built from the Magneticum simulations. From the simulated catalogs, the ML method estimates the amplitude of matter fluctuations, sigma_8, at approximately the expected theoretical limit. More importantly, the deep ML approach can be interpreted. We lay out three schemes for interpreting the ML technique: a leave-one-out method for assessing cluster importance, an average saliency for evaluating feature importance, and correlations in the terse layer for understanding whether an ML technique can be safely applied to observational data. These interpretation schemes led to the discovery of a previously unknown self-calibration mode for flux- and volume-limited cluster surveys. We describe this new mode, which uses the amplitude and peak of the cluster mass PDF as anchors for mass calibration. We introduce the term "overspecialized" to describe a common pitfall in astronomical applications of machine learning in which the ML method learns simulation-specific details, and we show how a carefully constructed architecture can be used to check for this source of systematic error.
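The leave-one-out scheme for assessing cluster importance can be illustrated with a minimal sketch. It assumes that a trained model can be treated as a function from a cluster sample to a single parameter estimate; the `toy_estimator` below is a hypothetical stand-in (a mean over one feature), not the encoder and CNN described in the abstract, and the feature values are invented for illustration only.

```python
from statistics import fmean
from typing import Callable, List, Sequence

Cluster = Sequence[float]  # one cluster's feature vector (hypothetical layout)


def leave_one_out_importance(
    estimator: Callable[[List[Cluster]], float],
    sample: Sequence[Cluster],
) -> List[float]:
    """Return, for each cluster, the shift in the estimate when it is removed.

    A large absolute shift means the estimate depends strongly on that cluster.
    """
    sample = list(sample)
    full_estimate = estimator(sample)
    shifts = []
    for i in range(len(sample)):
        reduced = sample[:i] + sample[i + 1:]  # sample without cluster i
        shifts.append(full_estimate - estimator(reduced))
    return shifts


def toy_estimator(sample: List[Cluster]) -> float:
    # Hypothetical stand-in for a trained model: mean of the first feature.
    return fmean(cluster[0] for cluster in sample)


# Invented toy catalog: five clusters, two features each.
clusters = [
    (0.1, 1.0),
    (-0.2, 0.9),
    (5.0, 1.1),  # deliberately extreme first feature
    (0.3, 1.0),
    (0.0, 0.8),
]

importance = leave_one_out_importance(toy_estimator, clusters)
most_important = max(range(len(importance)), key=lambda i: abs(importance[i]))
```

In this toy case the cluster with the extreme feature value dominates the shift. With a real trained model, `estimator` would wrap a forward pass over the reduced sample, and the shifts could be compared against cluster properties such as mass or redshift.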