Improving Photometric Redshift Estimates with Training Sample Augmentation
Abstract
Large imaging surveys will rely on photometric redshifts (photo-z's), which are typically estimated through machine-learning methods. Currently planned spectroscopic surveys will not be deep enough to produce a representative training sample for the Legacy Survey of Space and Time (LSST), so we seek methods to improve the photo-z estimates that arise from nonrepresentative training samples. Spectroscopic training samples for photo-z's are biased toward redder, brighter galaxies, which also tend to be at lower redshift than the typical galaxy observed by LSST. This bias leads to poor photo-z estimates, with outlier fractions nearly 4 times larger than for a representative training sample. In this Letter, we apply the concept of training sample augmentation, in which we augment simulated nonrepresentative training samples with simulated galaxies possessing otherwise unrepresented features. When we select simulated galaxies with (g-z) color, i-band magnitude, and redshift outside the range of the original training sample, we reduce the outlier fraction of the photo-z estimates for simulated LSST data by nearly 50% and the normalized median absolute deviation (NMAD) by 56%. When compared to a fully representative training sample, augmentation can recover nearly 70% of the degradation in the outlier fraction and 80% of the degradation in NMAD. Training sample augmentation is a simple and effective way to improve training samples for photo-z's without requiring additional spectroscopic samples.
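The selection step and the two quality metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes that "outside the range" means outside the training sample's minimum-to-maximum interval in at least one of the three features, that NMAD is 1.4826 times the median absolute deviation of Δz/(1+z), and that an outlier is a galaxy with |Δz|/(1+z) > 0.15. These are common conventions, but the abstract does not state which definitions the Letter uses. All function names are hypothetical.

```python
import numpy as np

def outside_training_range(train, pool):
    """Mask of pool rows lying outside the training min/max in any column.

    Columns are assumed to be (g-z color, i-band magnitude, redshift);
    the "any column" rule is an assumption, not taken from the Letter.
    """
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    return np.any((pool < lo) | (pool > hi), axis=1)

def scaled_residual(z_phot, z_true):
    """Redshift error scaled by (1 + z_true)."""
    return (z_phot - z_true) / (1.0 + z_true)

def nmad(z_phot, z_true):
    """Normalized median absolute deviation (assumed 1.4826 * MAD form)."""
    dz = scaled_residual(z_phot, z_true)
    return 1.4826 * np.median(np.abs(dz - np.median(dz)))

def outlier_fraction(z_phot, z_true, threshold=0.15):
    """Fraction of galaxies with |dz| above an assumed threshold of 0.15."""
    dz = scaled_residual(z_phot, z_true)
    return float(np.mean(np.abs(dz) > threshold))

def fraction_recovered(biased, augmented, representative):
    """Share of the biased-vs-representative degradation removed by augmentation."""
    return (biased - augmented) / (biased - representative)

# Toy data: two training galaxies and three candidate simulated galaxies.
train = np.array([[0.5, 20.0, 0.2],
                  [1.5, 22.0, 0.8]])
pool = np.array([[1.0, 21.0, 0.5],   # inside the training range in all features
                 [2.0, 21.0, 0.5],   # redder than any training galaxy
                 [1.0, 24.0, 1.2]])  # fainter and at higher redshift
mask = outside_training_range(train, pool)
augmented_train = np.vstack([train, pool[mask]])
```

As a rough consistency check using the abstract's rounded figures: if the representative sample's outlier fraction is normalized to 1, the nonrepresentative sample sits near 4, and a reduction of nearly 50% brings it to about 2. `fraction_recovered(4.0, 2.0, 1.0)` then gives 2/3, in line with the quoted "nearly 70%".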
- Publication:
- The Astrophysical Journal
- Pub Date:
- May 2024
- DOI:
- 10.3847/2041-8213/ad4039
- arXiv:
- arXiv:2402.15551
- Bibcode:
- 2024ApJ...967L...6M
- Keywords:
- Observational cosmology;
- 1146;
- Astrophysics - Instrumentation and Methods for Astrophysics;
- Astrophysics - Cosmology and Nongalactic Astrophysics
- E-Print:
- 11 pages, 4 figures, published in ApJ Letters