Evaluating Sparse Galaxy Simulations via Out-of-Distribution Detection and Amortized Bayesian Model Comparison
Abstract
Cosmological simulations are a powerful tool to advance our understanding of galaxy formation, and many simulations model key properties of real galaxies. A question that naturally arises for such simulations in light of high-quality observational data is: How close are the models to reality? Due to the high dimensionality of the problem, many previous studies evaluate galaxy simulations using simplified summary statistics of physical properties. In this work, we combine simulation-based Bayesian model comparison with a novel misspecification detection technique to compare simulated galaxy images of six hydrodynamical models against real observations. Since cosmological simulations are computationally costly, we address the problem of low simulation budgets by first training a $k$-sparse variational autoencoder (VAE) on the abundant dataset of SDSS images. The VAE learns to extract informative latent embeddings and delineates the typical set of real images. To reveal simulation gaps, we then perform out-of-distribution (OOD) detection based on the logits of classifiers trained on the embeddings of simulated images. Finally, we perform amortized Bayesian model comparison using probabilistic classification, which identifies the best-performing model in relative terms and provides partial explanations through SHAP values.
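The three steps named in the abstract (sparse latent embeddings, logit-based OOD detection, and classifier-based model comparison) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `k_sparse`, `posterior_model_probs`, and `ood_flags` are hypothetical, the top-$k$-by-magnitude sparsification and the max-logit OOD score are assumptions chosen for illustration, and reading the softmax output as posterior model probabilities assumes a well-calibrated classifier trained with equal prior model probabilities.

```python
import numpy as np

def k_sparse(z, k):
    """Keep the k largest-magnitude entries of each latent vector, zero the rest.
    (Assumed sparsification rule; the paper's VAE may differ.)"""
    z = np.asarray(z, dtype=float)
    n_drop = z.shape[-1] - k
    # Indices of the n_drop smallest-magnitude entries along the last axis.
    drop_idx = np.argsort(np.abs(z), axis=-1)[..., :n_drop]
    out = z.copy()
    np.put_along_axis(out, drop_idx, 0.0, axis=-1)
    return out

def posterior_model_probs(logits):
    """Softmax over classifier logits, read as approximate posterior model
    probabilities p(M_j | x) under equal prior model probabilities."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(shifted)
    return p / p.sum(axis=-1, keepdims=True)

def ood_flags(logits_real, logits_sim, quantile=0.01):
    """Flag real-image embeddings whose max logit falls below a low quantile
    of the max logits seen on simulated embeddings (assumed OOD score)."""
    score_sim = np.max(np.asarray(logits_sim, dtype=float), axis=-1)
    score_real = np.max(np.asarray(logits_real, dtype=float), axis=-1)
    threshold = np.quantile(score_sim, quantile)
    return score_real < threshold
```

In this sketch, the classifier would be trained once on embeddings of simulated images and then applied to any number of real embeddings without retraining, which is the sense in which the comparison is amortized.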
- Publication:
- arXiv e-prints
- Pub Date:
- October 2024
- DOI:
- 10.48550/arXiv.2410.10606
- arXiv:
- arXiv:2410.10606
- Bibcode:
- 2024arXiv241010606Z
- Keywords:
- Astrophysics - Astrophysics of Galaxies; Physics - Computational Physics; Physics - Data Analysis; Statistics and Probability; Physics - Space Physics
- E-Print:
- Accepted to the Machine Learning and the Physical Sciences workshop at NeurIPS 2024