Reconstructing and Classifying SDSS DR16 Galaxy Spectra with Machine-Learning and Dimensionality Reduction Algorithms
Optical spectra of galaxies and quasars from large cosmological surveys are used to measure redshifts and infer distances. They are also rich in information on the intrinsic properties of these astronomical objects. However, their physical interpretation can be challenging because of the large number of degrees of freedom, the various sources of noise, and degeneracies between physical parameters that produce similar spectral characteristics. To gain deeper insight into these degeneracies, we apply two unsupervised machine learning frameworks to a sample from the Sloan Digital Sky Survey Data Release 16 (SDSS DR16). The first framework is a Probabilistic Auto-Encoder (PAE), a two-stage deep learning framework consisting of a data compression stage (from 1000 elements to 10 parameters) and a density estimation stage. The second framework is Uniform Manifold Approximation and Projection (UMAP), which we apply to both the uncompressed and the compressed data. Exploring different regions of the UMAP embedding of the compressed data, we construct sequences of stacked spectra that show a gradual transition from star-forming galaxies with narrow emission lines and blue spectra to passive galaxies with absorption lines and red spectra. Focusing on galaxies with broad emission lines produced by quasars, we find a sequence with varying levels of obscuration caused by cosmic dust. The experiments we present here inform future applications of neural networks and dimensionality reduction algorithms to large astronomical spectroscopic surveys.
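The two-stage structure described in the abstract (compress each 1000-element spectrum to 10 parameters, then estimate a density over those parameters) can be illustrated with a minimal sketch. This is not the authors' PAE: as a loudly stated assumption, a linear compressor (PCA via SVD) stands in for the neural auto-encoder, a single multivariate Gaussian stands in for the density estimation stage, and the input is synthetic random data rather than SDSS spectra. All function and variable names are hypothetical.

```python
import numpy as np

def fit_linear_compressor(spectra, n_latent=10):
    """Stage 1 stand-in: PCA basis (assumption: replaces the neural auto-encoder)."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_latent]          # components: (n_latent, n_pixels)

def encode(spectra, mean, components):
    """Compress spectra to n_latent parameters each."""
    return (spectra - mean) @ components.T

def decode(latents, mean, components):
    """Reconstruct spectra from their compressed parameters."""
    return latents @ components + mean

def fit_gaussian_density(latents):
    """Stage 2 stand-in: one Gaussian over the latent space (assumption)."""
    return latents.mean(axis=0), np.cov(latents, rowvar=False)

def log_density(latents, mu, cov):
    """Log-probability of each latent vector under the fitted Gaussian."""
    d = latents - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", d @ np.linalg.inv(cov), d)
    return -0.5 * (maha + logdet + mu.size * np.log(2.0 * np.pi))

# Synthetic "spectra": 500 objects, 1000 pixels, 10 underlying parameters.
rng = np.random.default_rng(0)
n_spectra, n_pixels, n_latent = 500, 1000, 10
true_latents = rng.normal(size=(n_spectra, n_latent))
basis = rng.normal(size=(n_latent, n_pixels))
spectra = true_latents @ basis + 0.1 * rng.normal(size=(n_spectra, n_pixels))

mean, components = fit_linear_compressor(spectra, n_latent)
latents = encode(spectra, mean, components)            # shape (500, 10)
reconstructed = decode(latents, mean, components)      # shape (500, 1000)
mu, cov = fit_gaussian_density(latents)
scores = log_density(latents, mu, cov)                 # low score = unusual object
mse = float(np.mean((spectra - reconstructed) ** 2))
```

In a setup like this, the 10-parameter `latents` array is what would be handed to a UMAP implementation to obtain a two-dimensional embedding; that step is omitted here because it requires a third-party package.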
- Pub Date: November 2022
- Keywords: Astrophysics - Astrophysics of Galaxies; Astrophysics - Cosmology and Nongalactic Astrophysics
- ASP Conference Series, Compendium of Undergraduate Research in Astronomy and Space Science (accepted), 24 pages, 14 figures