Astronomical surveys of celestial sources produce streams of noisy time series measuring flux versus time ('light curves'). Unlike in many other physical domains, however, large (and source-specific) temporal gaps in data arise naturally due to intranight cadence choices as well as diurnal and seasonal constraints [1-5]. With nightly observations of millions of variable stars and transients from upcoming surveys [4,6], efficient and accurate discovery and classification techniques on noisy, irregularly sampled data must be employed with minimal human-in-the-loop involvement. Machine learning for inference tasks on such data traditionally requires the laborious hand-coding of domain-specific numerical summaries of raw data ('features') [7]. Here, we present a novel unsupervised autoencoding recurrent neural network [8] that makes explicit use of sampling times and known heteroskedastic noise properties. When trained on optical variable star catalogues, this network produces supervised classification models that rival other best-in-class approaches. We find that autoencoded features learned in one time-domain survey perform nearly as well when applied to another survey. These networks can continue to learn from new unlabelled observations and may be used in other unsupervised tasks, such as forecasting and anomaly detection.
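As a rough illustration of what "explicit use of sampling times and known heteroskedastic noise properties" could look like in practice, the sketch below pairs each flux value with the time elapsed since the previous observation and scores a reconstruction with an uncertainty-weighted squared error. This is a minimal sketch under stated assumptions, not the authors' implementation (their code is in the linked repository): the function names `to_network_input` and `weighted_mse`, the use of time differences as the input encoding, and the division of residuals by per-point uncertainties are all assumptions made here for illustration.

```python
def to_network_input(times, fluxes):
    """Pair each flux with the time gap since the previous observation.

    Assumption: the first observation gets a gap of 0.0.
    """
    deltas = [0.0] + [t1 - t0 for t0, t1 in zip(times, times[1:])]
    return list(zip(deltas, fluxes))


def weighted_mse(fluxes, reconstructed, sigmas):
    """Mean squared residual, each residual scaled by its reported uncertainty.

    Points with large uncertainty contribute less to the loss.
    """
    terms = [((f - r) / s) ** 2 for f, r, s in zip(fluxes, reconstructed, sigmas)]
    return sum(terms) / len(terms)


# Hypothetical, irregularly sampled light curve (illustrative values only).
times = [0.0, 0.5, 3.0, 10.0]
fluxes = [1.0, 1.2, 0.8, 1.1]
sigmas = [0.1, 0.1, 0.4, 0.2]

inputs = to_network_input(times, fluxes)
perfect = weighted_mse(fluxes, fluxes, sigmas)
```

In this sketch an autoencoder would consume `inputs` as its sequence and be trained to minimise `weighted_mse` between the observed and decoded fluxes; the network itself is omitted.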
- Pub Date: November 2018
- Astrophysics - Instrumentation and Methods for Astrophysics;
- Astrophysics - Solar and Stellar Astrophysics;
- Physics - Data Analysis, Statistics and Probability
- 23 pages, 14 figures. The published version is at Nature Astronomy (https://www.nature.com/articles/s41550-017-0321-z). Source code for models, experiments, and figures at https://github.com/bnaul/IrregularTimeSeriesAutoencoderPaper (Zenodo Code DOI: 10.5281/zenodo.1045560)