HLSTimeSformer: A Self-Supervised Approach to Analyze Geospatial Imagery
Abstract
We propose HLSTimeSformer, a self-supervised learning framework that learns representations of geospatial imagery. In this work, we examine the Harmonized Landsat and Sentinel-2 (HLS) data product. Our approach predicts a future representation of a region given a sequence of past views. We apply a masking technique that lets the model process only the unmasked portions of the image sequence, handling cases where input pixels are obscured by satellite viewing angle, cloud cover, and other aberrations. Our architecture improves the scalability of applying foundation models to remote sensing data and produces representations with high semantic detail, suitable for downstream tasks such as wildfire detection, crop yield prediction, and inland flood monitoring. On a dataset of 55,601 images, the model converged to a validation loss of 1.82e-3 after 8,000 epochs on two NVIDIA A100 GPUs with a batch size of 32, when tasked with predicting the last day of a 7-day sequence. The loss is the mean squared error (MSE) between the model output and the ground truth (the last image in the sequence).
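The masked-prediction setup described above can be sketched minimally: patch tokens flagged as obscured are dropped before encoding, and the training target is the final day of the sequence, scored with MSE. The sketch below is illustrative only, not the authors' implementation; the mean-over-past "model" is a placeholder for the transformer, and the random mask stands in for a real HLS quality band.

```python
import numpy as np

rng = np.random.default_rng(0)
days, patches, dim = 7, 16, 8            # toy sizes; real patch grids are larger
sequence = rng.random((days, patches, dim))

# Boolean mask: True where a patch is clear (unobscured). In practice this
# would be derived from cloud/quality flags; random here for illustration.
clear = rng.random((days - 1, patches)) > 0.3

# Keep only unmasked tokens from the first 6 days -- the encoder never sees
# obscured patches, which is what makes the approach scale.
visible_tokens = sequence[:-1][clear]    # shape: (n_visible, dim)

# Placeholder "model": predict each day-7 patch as the mean of that patch's
# clear past observations (stand-in for the transformer's prediction head).
prediction = np.empty((patches, dim))
for p in range(patches):
    past = sequence[:-1, p][clear[:, p]]
    prediction[p] = past.mean(axis=0) if len(past) else sequence[:-1, p].mean(axis=0)

# Training objective from the abstract: MSE between prediction and the
# ground-truth final frame of the 7-day sequence.
mse = float(np.mean((prediction - sequence[-1]) ** 2))
print(f"visible tokens: {len(visible_tokens)}, validation-style MSE: {mse:.4f}")
```

In a real pipeline the visible tokens would be fed to a TimeSformer-style encoder; dropping masked tokens up front shrinks the attention cost roughly in proportion to the fraction of obscured pixels.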
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2022
- Bibcode: 2022AGUFMIN32D0402G