iRODS-Based Climate Data Services and Virtualization-as-a-Service in the NASA Center for Climate Simulation
Abstract
Scientific data services are becoming an important part of the NASA Center for Climate Simulation's mission. Our technological response to this expanding role is built around the concept of specialized virtual climate data servers, repetitive cloud provisioning, image-based deployment and distribution, and virtualization-as-a-service. A virtual climate data server is an OAIS-compliant, iRODS-based data server designed to support a particular type of scientific data collection. iRODS is data grid middleware that provides policy-based control over collection-building, managing, querying, accessing, and preserving large scientific data sets. We have developed prototype vCDSs to manage NetCDF, HDF, and GeoTIF data products. We use RPM scripts to build vCDS images in our local computing environment, our local Virtual Machine Environment, NASA's Nebula Cloud Services, and Amazon's Elastic Compute Cloud. Once provisioned into these virtualized resources, multiple vCDSs can use iRODS's federation and realized object capabilities to create an integrated ecosystem of data servers that can scale and adapt to changing requirements. This approach enables platform- or software-as-a-service deployment of the vCDSs and allows the NCCS to offer virtualization-as-a-service, a capacity to respond in an agile way to new customer requests for data services, and a path for migrating existing services into the cloud. We have registered MODIS Atmosphere data products in a vCDS that contains 54 million registered files, 630TB of data, and over 300 million metadata values. We are now assembling IPCC AR5 data into a production vCDS that will provide the platform upon which NCCS's Earth System Grid (ESG) node publishes to the extended science community. In this talk, we describe our approach, experiences, lessons learned, and plans for the future.
- Publication:
-
AGU Fall Meeting Abstracts
- Pub Date:
- December 2011
- Bibcode:
- 2011AGUFMIN41A1397S
- Keywords:
-
- 1912 INFORMATICS / Data management;
- preservation;
- rescue;
- 1920 INFORMATICS / Emerging informatics technologies;
- 1976 INFORMATICS / Software tools and services