From petascale to exascale, the future of simulated climate data (Invited)
Abstract
Coleridge ought to have said: data, data, everywhere, and all the data centres groan; data, data, everywhere, nor any I should clone. Except, of course, he didn't say it, and we do clone data! While we have been dealing with terabytes of simulated data, downloading ("cloning") and analysing it has been a plausible way forward. In doing so, we have set up systems that support four broad classes of activity: personal data analysis, institutional data analysis, federated data systems, and data portals. We use metadata to manage the migration of data between these (and their communities), and we have built software systems. However, our metadata and software solutions are fragile, often dependent on soft money and loose governance arrangements. We often download data with minimal provenance, and many of us often download the same data. In the not-too-distant future, we can imagine exabytes of data being produced, and all these problems will get worse. Arguably, we have no plausible methods for exploiting such data effectively, particularly if the analysis requires intercomparison. Yet, of course, we know full well that intercomparison is at the heart of climate science. In this talk, we review the current status of simulation data management, with special emphasis on accessibility and usability. We talk about file formats, bundles of files (real and virtual), and simulation metadata. We introduce the InfraStructure for the European Network for Earth System Modelling (IS-ENES) and its relationship with the Earth System Grid Federation (ESGF), as well as JASMIN, the UK Joint Analysis System. There will be a small digression on parallel data analysis, both local and distributed. We then progress to the near-term problems (and solutions) for climate data before scoping out the problems of the future, both for data handling and for the models that produce the data. The way we think about data, computing, models, and even ensemble design may need to change.
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2013
- Bibcode: 2013AGUFM.U41A..04L
- Keywords: 1622 GLOBAL CHANGE Earth system modeling; 1626 GLOBAL CHANGE Global climate models; 1904 INFORMATICS Community standards; 1932 INFORMATICS High-performance computing