SubDLe: Identification of substructures in cosmological simulations with deep learning. An image segmentation approach to substructure finding
Abstract
Context: The identification of substructures within halos in cosmological hydrodynamical simulations is a fundamental step in finding the simulated counterparts of real objects, namely galaxies. Substructure finders therefore play a crucial role in extracting relevant information from simulation outputs. In general, they rely on physically motivated definitions of substructures and perform multiple steps of particle-by-particle operations, which makes them computationally expensive.
Aims: The purpose of this work is to develop a fast algorithm to identify substructures, especially galaxies, in simulations. The final aim, besides a faster production of subhalo catalogs, is to provide an algorithm fast enough to be applied with a fine time cadence during the evolution of the simulations. Having access to galaxy catalogs while the simulation is evolving is indeed necessary for sub-resolution models based on the global properties of galaxies.
Methods: In this context, machine learning methods offer a wide range of automated tools for fast analysis of large data sets. We therefore chose to apply the architecture of a well-known fully convolutional network, U-Net, to the identification of substructures within the mass density field of the simulation. We have developed SubDLe (Substructure identification with Deep Learning), an algorithm that combines a 3D generalization of U-Net and a Friends-of-Friends algorithm, and trained it to reproduce the identification of substructures performed by the SubFind algorithm in a set of zoom-in cosmological hydrodynamical simulations of galaxy clusters. For the feasibility study presented in this work, we have trained and tested SubDLe on galaxy clusters at z = 0, using an NVIDIA P100 GPU. We focused our tests on the version of the algorithm that identifies purely stellar substructures, stellar SubDLe.
Results: Our stellar SubDLe proved very efficient in identifying most of the galaxies, 82% on average, in a set of 12 clusters at z = 0. To test the robustness of the method, we also performed some tests at z = 1 and increased the resolution of the input density grids. The average time taken by our SubDLe to analyze one cluster is about 70 s, around a factor of 30 less than the typical time taken by SubFind on a single computing node.
Conclusions: Our stellar SubDLe is capable of identifying the majority of galaxies in the challenging high-density environment of galaxy clusters in short computing times. This result has interesting implications in view of the possibility of integrating fast subhalo finders within simulation codes, which can take advantage of accelerators available in state-of-the-art computing nodes.
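As an illustration of the kind of pipeline described under Methods, the following is a minimal sketch, not the authors' implementation. It deposits particles onto a 3D density grid, flags voxels, and then links the particles inside flagged voxels with a simple Friends-of-Friends pass. The way the network output and the Friends-of-Friends step are coupled here is an assumption, and the trained 3D U-Net is replaced by a plain density threshold (`threshold_mask`) purely as a stand-in. All function names, parameters, and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
import math


def deposit_density(positions, masses, box_size, n_cells):
    """Nearest-grid-point deposit; returns ({voxel: density}, voxel index per particle)."""
    cell = box_size / n_cells
    grid = defaultdict(float)
    voxel_of = []
    for (x, y, z), m in zip(positions, masses):
        idx = tuple(min(int(c / cell), n_cells - 1) for c in (x, y, z))
        grid[idx] += m / cell ** 3
        voxel_of.append(idx)
    return grid, voxel_of


def threshold_mask(grid, density_threshold):
    """Stand-in for the segmentation network (assumption): flag dense voxels."""
    return {idx for idx, rho in grid.items() if rho >= density_threshold}


def friends_of_friends(points, linking_length):
    """Group points closer than linking_length using union-find (O(N^2) pairs)."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= linking_length:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(points)):
        groups[find(i)].append(i)
    return sorted(groups.values(), key=len, reverse=True)


def find_substructures(positions, masses, box_size, n_cells,
                       density_threshold, linking_length):
    """Density grid -> voxel mask -> FoF on particles inside flagged voxels."""
    grid, voxel_of = deposit_density(positions, masses, box_size, n_cells)
    mask = threshold_mask(grid, density_threshold)
    selected = [i for i, idx in enumerate(voxel_of) if idx in mask]
    groups = friends_of_friends([positions[i] for i in selected], linking_length)
    # Map group members back to the original particle indices.
    return [[selected[i] for i in group] for group in groups]


# Toy data: two compact clumps and one isolated particle, unit masses.
positions = [(1.0, 1.0, 1.0), (1.1, 1.0, 1.0), (1.0, 1.1, 1.0),
             (7.0, 7.0, 7.0), (7.1, 7.0, 7.0),
             (4.0, 4.0, 4.0)]
masses = [1.0] * len(positions)
substructures = find_substructures(positions, masses, box_size=8.0, n_cells=8,
                                   density_threshold=2.0, linking_length=0.2)
print(substructures)
```

In this toy setup the isolated particle falls in a voxel below the threshold and is discarded before the linking step, so only the two clumps are returned. A realistic version would replace `threshold_mask` with the prediction of a trained network and use a tree-based neighbour search instead of the quadratic pair loop.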
- Publication:
- Astronomy and Astrophysics
- Pub Date:
- September 2024
- DOI:
- arXiv:
- arXiv:2405.18257
- Bibcode:
- 2024A&A...689A..33E
- Keywords:
- hydrodynamics;
- methods: data analysis;
- methods: numerical;
- galaxies: clusters: general;
- Astrophysics - Cosmology and Nongalactic Astrophysics;
- Astrophysics - Instrumentation and Methods for Astrophysics
- E-Print:
- 17 pages, 15 figures, submitted to A&