Scientific workflows, reproducibility and uncertainty quantification in the paleogeosciences
Abstract
Paleogeoscientists piece together an understanding of past climates, environments, ecosystems and more from a diverse array of natural archives. Understanding local changes, and the various processes by which a system filters and records an environmental signal, is complex, and that complexity multiplies when seeking to understand spatial patterns of change and their causes.
The lack of data standards and of efficient access to data, together with the complexity of the approaches needed to integrate paleogeoscientific data, has led to a severe lack of transparency and reproducibility, especially in large synthetic studies in paleoclimatology. This lack of transparency, combined with the public interest and political significance associated with some paleoclimate studies, has led to several high-profile controversies, fueled in part by a lack of reproducibility. Thankfully, open data standards, open software and pathways for sharing scientific workflows are now emerging. Here, we evaluate the status of these efforts in the paleogeosciences. Specifically, we discuss the emergence and adoption of the Linked PaleoData (LiPD) framework [1], the expansion of community-curated data repositories, such as Neotoma [2] and LinkedEarth [3], and the development of open software that is integrated with these data sources and supports transparent, end-to-end workflows. We illustrate the potential of these advances with a case study that takes advantage of the efforts of the EarthCube P418 Geodex [4] project to integrate complementary datasets from LinkedEarth and Neotoma, quantify the uncertainty that geochronologic uncertainty introduces into the analyses, and visualize the results using the GeoChronR [5] package. The entire process is documented in a scientific workflow that results in full transparency and reproducibility. Finally, we evaluate the adoption of these emerging technologies, highlight applications, and discuss community needs.
[1] http://lipd.net
[2] https://www.neotomadb.org/
[3] http://linked.earth/
[4] http://geodex.org/
[5] https://nickmckay.github.io/GeoChronR/
- Publication:
- AGU Fall Meeting Abstracts
- Pub Date:
- December 2018
- Bibcode:
- 2018AGUFMIN41A..07M
- Keywords:
- 0520 Data analysis: algorithms and implementation; COMPUTATIONAL GEOPHYSICS
- 1904 Community standards; INFORMATICS
- 1976 Software tools and services; INFORMATICS
- 1978 Software re-use; INFORMATICS