Coping with the deluge of 'Big Data': The challenge of exploiting satellite earth observation data in the new era of High Performance Data
Abstract
Australia's Earth Observation Program has acquired and archived satellite data for the Australian Government since the establishment of the Australian Landsat Station in 1979. Data have been acquired from many sensors and platforms including ERS, EnviSAT, MODIS, ASTER, SPOT and ALOS, although the bulk of the continuous observations are from the Landsat instruments. The Landsat mission is the longest continuous environmental monitoring experiment in history, producing a global archive of earth observations spanning over 41 years. Geoscience Australia maintains an archive of Landsat data for Australia and produces products and information to support the delivery of government policy objectives. Future Earth observation missions promise an exponential increase in the volumes of open data from Earth observing satellites. For the Australian region the NASA/USGS Landsat-8 satellite is now contributing up to 50 GB of data per day, and ESA's Sentinel-2 constellation (due for launch in early 2014) will provide close to 500 GB of data per day to Australia's existing archive of earth observation data. With just these two new data sources the Australian Satellite Earth Observation archive is expected to grow to around 1 PB by the end of 2014. Extracting information from satellite data is a long-standing challenge made more difficult by increased data volumes. Recognising this issue, the Australian Government funded the 'Unlocking the Landsat Archive' (ULA) consortium project from 2010 to 2013 to process Australia's Landsat archive to fully calibrated, sensor- and scene-independent data products for the period from 1998 to 2012, and to investigate methods of arranging this archive so that it can be exploited to produce value-added information products. The data outputs from the ULA project, currently totalling close to 400 TB, have become a fundamental component of Australia's eResearch infrastructure.
The data are hosted on the National Computational Infrastructure (NCI) and are openly available under a Creative Commons licence. A key challenge for data custodians in this era of High Performance Data (HPD) is how to store and organise very large datasets that extend from tera-scale to peta-scale in a way that will facilitate data interoperability. Building on the foundation laid by the ULA project, Geoscience Australia is developing advanced data cube technologies that will enable the Australian Government to cope with the anticipated flow of new Earth observation data from future platforms. By collocating the data with the high-performance computing capability of the NCI, and taking a standards approach to provide access to these data as a service, there is an opportunity to truly unlock the potential of Earth observation data to address questions in ways that previously were not possible.
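As a rough illustration of the ingest volumes quoted in the abstract, the sketch below converts the stated daily rates (up to 50 GB per day for Landsat-8, close to 500 GB per day for Sentinel-2) into a yearly total. It is a back-of-envelope estimate only, not the authors' projection method: the function name, the use of decimal units (1 TB = 1000 GB), and the assumption of constant ingest on all 365 days are illustrative choices.

```python
from typing import Mapping

# Daily ingest rates quoted in the abstract, treated as upper bounds (GB per day).
DAILY_INGEST_GB = {"Landsat-8": 50, "Sentinel-2": 500}

GB_PER_TB = 1000      # assumption: decimal units
DAYS_PER_YEAR = 365   # assumption: constant ingest every day of the year


def annual_ingest_tb(daily_gb: Mapping[str, float], days: int = DAYS_PER_YEAR) -> float:
    """Return the combined ingest in TB over `days` days for the given daily rates."""
    return sum(daily_gb.values()) * days / GB_PER_TB


if __name__ == "__main__":
    total = annual_ingest_tb(DAILY_INGEST_GB)
    print(f"Combined ingest: about {total:.0f} TB per year")
```

Under these assumptions the two sources together add roughly 200 TB per year, on top of the close to 400 TB of ULA outputs mentioned above.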
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2013
- Bibcode: 2013AGUFM.U41A..03P
- Keywords:
- 1932 INFORMATICS High-performance computing;
- 1930 INFORMATICS Data and information governance;
- 1936 INFORMATICS Interoperability;
- 1988 INFORMATICS Temporal analysis and representation