Data Conservancy: Infrastructure for Data Re-Use and Sharing
Abstract
The Data Conservancy (DC) embraces a shared vision: scientific data curation is a means to collect, organize, validate and preserve data so that scientists can find new ways to address the grand research challenges that face society. DC has explicitly focused on a diverse range of data, disciplines and communities to maximize the potential for data re-use and sharing, particularly in unanticipated ways or by new communities. Rather than rely on a road map, DC has embraced the "principles of navigation" outlined in socio-technical research on infrastructure development. One of the key principles of navigation is that preservation fosters re-use: many of the actions required to prepare data for ingestion and preservation also reinforce eventual data re-use and sharing. DC's architecture and infrastructure development are consistent with the Open Archival Information System (OAIS) reference model for digital archiving. This approach not only ensures a robust, full-fledged preservation framework but also supports rich, nuanced discovery, access and sharing capabilities and policies for a range of diverse data. For example, rather than assume that encumbered data (e.g., endangered species locations) is either available or unavailable, DC has considered approaches such as "data fuzzing," the deliberate obfuscation of location, to balance utility and confidentiality. Most recently, DC has developed the concept of a Data Conservancy Instance (DCI), which comprises hardware, software, staffing, policies and a sustainability model to support data re-use and sharing for designated communities. DC is planning to launch three DCIs by July 2012 at the University of Colorado/National Snow and Ice Data Center, the Marine Biological Laboratory and Johns Hopkins University. In subsequent years, these DCIs will work as an interoperable network providing "key integrators" such as data replication or query capabilities (e.g., geo-spatial, taxonomic) across the DCIs.
Additionally, the DCIs will feature APIs that facilitate interoperability with other data infrastructure nodes within the broader cyberinfrastructure landscape.
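The abstract does not say how DC would implement "data fuzzing." As an illustration only, the sketch below uses one common obfuscation technique, snapping a coordinate to the centre of a coarse grid cell, so that a sensitive location is reported only to within a chosen resolution. The function name `fuzz_location` and the parameter `cell_deg` are hypothetical and are not taken from DC software.

```python
import math
from typing import Tuple


def fuzz_location(lat: float, lon: float, cell_deg: float = 0.1) -> Tuple[float, float]:
    """Hypothetical helper: report the centre of the grid cell containing a point.

    All points inside the same cell map to the same output, so the exact
    location cannot be recovered; the cell size sets the trade-off between
    utility (small cells) and confidentiality (large cells).
    """
    if cell_deg <= 0:
        raise ValueError("cell_deg must be positive")

    # Index of the cell containing the point, then the centre of that cell.
    fuzzed_lat = (math.floor(lat / cell_deg) + 0.5) * cell_deg
    fuzzed_lon = (math.floor(lon / cell_deg) + 0.5) * cell_deg

    # Keep the result inside valid geographic bounds.
    fuzzed_lat = max(-90.0, min(90.0, fuzzed_lat))
    fuzzed_lon = max(-180.0, min(180.0, fuzzed_lon))
    return fuzzed_lat, fuzzed_lon


# Two nearby observations fall in the same 0.5-degree cell and become indistinguishable.
print(fuzz_location(39.33, -76.62, cell_deg=0.5))  # (39.25, -76.75)
print(fuzz_location(39.40, -76.60, cell_deg=0.5))  # (39.25, -76.75)
```

A grid-snapping scheme is deterministic, so repeated queries cannot be averaged to recover the true position; a policy could assign different values of `cell_deg` to different user communities.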
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2011
- Bibcode: 2011AGUFMED51D..03C
- Keywords: 0800 EDUCATION; 1908 INFORMATICS / Cyberinfrastructure; 1976 INFORMATICS / Software tools and services