A Virtual Science Data Environment for Carbon Dioxide Observations
Abstract
Climate science data are often distributed cross-institutionally and made available using heterogeneous interfaces. Observational carbon-dioxide (CO2) records, in particular, span national and international institutions and are typically distributed using a variety of data standards. Such an arrangement poses challenges from a research perspective, as users often need to aggregate datasets independently and to address data quality themselves. To tackle this dispersion and heterogeneity of data, we have developed the CO2 Virtual Science Data Environment, a comprehensive approach to virtually integrating CO2 data and metadata from multiple missions and providing a suite of computational services that facilitate analysis, comparison, and transformation of that data. The environment provides climate scientists with a unified web-based destination for discovering relevant observational data in context, and supports a growing range of online tools and services for analyzing and transforming the available data to suit individual research needs. It includes web-based tools to geographically and interactively search for CO2 observations collected from multiple airborne, space-based, and terrestrial platforms. Moreover, the data analysis services it provides over the Internet, including techniques such as bias estimation and spatial re-gridding, move computation closer to the data and reduce the complexity of performing these operations repeatedly and at scale. The key to enabling these services, as well as to consolidating the disparate data into a unified resource, has been to use metadata descriptors as the foundation of our data environment. This metadata-centric architecture, which leverages the Dublin Core standard, forgoes the need to replicate remote datasets locally.
Instead, the system relies upon an extensive, metadata-rich virtual data catalog allowing on-demand browsing and retrieval of CO2 records from multiple missions. In other words, key metadata information about remote CO2 records is stored locally while the data itself is preserved at its respective archive of origin. This strategy has been made possible by our method of encapsulating the heterogeneous sources of data using a common set of web-based services, including services provided by Jet Propulsion Laboratory's Climate Data Exchange (CDX). Furthermore, this strategy has enabled us to scale across missions, and to provide access to a broad array of CO2 observational data. Coupled with on-demand computational services and an intuitive web-portal interface, the CO2 Virtual Science Data Environment effectively transforms heterogeneous CO2 records from multiple sources into a unified resource for scientific discovery.
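The metadata-centric idea described above (descriptors stored locally, data left at the archive of origin) can be illustrated with a minimal sketch. It is not the system's actual implementation: the class names, the bounding-box convention, the record identifiers, and the `example.org` URLs are all hypothetical, and the field names are only loosely modeled on Dublin Core elements.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (west, south, east, north) in degrees; a hypothetical convention for this sketch.
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class CatalogRecord:
    """Locally stored descriptors, loosely named after Dublin Core elements."""
    identifier: str   # unique record id
    title: str        # human-readable name
    date: str         # observation date, ISO 8601
    coverage: BBox    # spatial extent of the record
    source: str       # URL at the archive of origin (the data is not copied)


def overlaps(a: BBox, b: BBox) -> bool:
    """Return True when two bounding boxes intersect (no antimeridian handling)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class VirtualCatalog:
    """Holds metadata only; retrieval would follow each record's source URL."""

    def __init__(self) -> None:
        self._records: List[CatalogRecord] = []

    def register(self, record: CatalogRecord) -> None:
        self._records.append(record)

    def search(self, region: BBox) -> List[CatalogRecord]:
        # Geographic search runs entirely against the local metadata.
        return [r for r in self._records if overlaps(r.coverage, region)]


# Hypothetical records from two different archives.
catalog = VirtualCatalog()
catalog.register(CatalogRecord(
    "rec-001", "Hypothetical satellite CO2 granule", "2011-06-01",
    (-130.0, 20.0, -60.0, 55.0), "https://archive-a.example.org/rec-001"))
catalog.register(CatalogRecord(
    "rec-002", "Hypothetical tower CO2 record", "2011-06-01",
    (5.0, 45.0, 15.0, 55.0), "https://archive-b.example.org/rec-002"))

hits = catalog.search((-125.0, 30.0, -110.0, 45.0))
for record in hits:
    print(record.identifier, record.source)
```

In this sketch a search touches only the local catalog, and a client would fetch the matching data on demand from each record's `source` URL, which is one plausible way to avoid replicating remote datasets.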
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2011
- Bibcode: 2011AGUFMIN33D1486V
- Keywords: 1912 INFORMATICS / Data management, preservation, rescue; 1946 INFORMATICS / Metadata; 1960 INFORMATICS / Portals and user interfaces; 1996 INFORMATICS / Web Services