The CUAHSI Water Data Center: Empowering scientists to discover, use, store, and share water data
Abstract
The proposed CUAHSI Water Data Center (WDC) will provide production-quality water data resources based upon the successful large-scale data services prototype developed by the CUAHSI Hydrologic Information System (HIS) project. Using HIS technology, the WDC concentrates on providing time series data collected at fixed points or on moving platforms from sensors primarily (but not exclusively) in the medium of water. The WDC's missions include providing simple and effective data discovery tools useful to researchers in a variety of water-related disciplines, and providing simple and cost-effective data publication mechanisms for projects that do not wish to run their own data servers. The WDC's activities will include:
1. Rigorous curation of the water data catalog already assembled during the CUAHSI HIS project, to ensure the accuracy of records and the existence of declared sources.
2. Data backup and failover services for "at risk" data sources.
3. Creation and support of ubiquitously accessible data discovery and access through web-based search and smartphone applications.
4. Partnerships with researchers to extend the state of the art in water data use.
5. Partnerships with industry to create plug-and-play data publishing from sensors, and to create domain-specific tools.
The WDC will serve as a knowledge resource for researchers of water-related issues, and will interface with other data centers to make their data more accessible to water researchers. The WDC will serve as a vehicle for addressing some of the grand challenges of accessing and using water data, including:
a. Cross-domain data discovery: different scientific domains refer to the same kind of water data using different terminologies, making discovery of data difficult for researchers outside the data provider's domain.
b. Cross-validation of data sources: much water data comes from sources lacking rigorous quality control procedures; such sources can be compared against others with rigorous quality control. The WDC enables this by making both kinds of sources available in the same search interface.
c. Data provenance: the appropriateness of data for use in a specific model or analysis often depends upon the exact details of how the data were gathered and processed. The WDC will aid this by curating metadata standards that describe collection procedures as fully as practical. "Plug and play" sensor interfaces will fill in metadata appropriate to each sensor without human intervention.
d. Contextual search: discovering data based upon geological (e.g., aquifer) or geographic (e.g., location in a stream network) features external to metadata.
e. Data-driven search: discovering data that exhibit quality factors that are not described by the metadata.
The WDC will partner with researchers desiring contextual and data-driven search, and make results available to all. Many major data providers (e.g., federal agencies) are not mandated to provide access to data other than those they collect. The HIS project assembled data from over 90 different sources, thus demonstrating the promise of this approach. Meeting the grand challenges listed above will greatly enhance scientists' ability to discover, interpret, access, and analyze water data from across domains and sources to test Earth system hypotheses.
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2012
- Bibcode: 2012AGUFMED41A0669C
- Keywords: 1899 HYDROLOGY / General or miscellaneous; 1904 INFORMATICS / Community standards; 1916 INFORMATICS / Data and information discovery; 1996 INFORMATICS / Web Services