Standardizing Metadata Quality Review for an Environmental Data Repository
Abstract
The Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) is a data repository developed to support Earth and environmental science projects funded by the U.S. Department of Energy (DOE), and it is part of the DataONE network. One of the challenges ESS-DIVE faces is ensuring that submitted data packages include the thorough metadata needed to find and use each dataset.
Our goal is to ensure that all data packages published on ESS-DIVE have high-quality metadata that meets the FAIR data principles. However, extensive metadata quality reviews can require significant staff time and resources. We therefore combine automated checks, which catch issues upon submission, with a manual process for in-depth content reviews that require domain knowledge. Most of the automated checks were developed by DataONE and the Arctic Data Center and are designed to assess the findability, accessibility, interoperability, and reusability (FAIR-ness) of datasets by checking for the presence of metadata fields and by checking word counts. We are testing this suite of DataONE metadata checks as well as additional checks needed for our community. Automated checks reduce the time needed for manual reviews and provide instant feedback to users, thus expediting the publication process. To standardize the manual review process and provide consistent feedback to dataset authors, we use a checklist form with specific requirements for each metadata element. The completed form for each dataset lets us track dataset quality before and after review, as well as the amount of time spent on the review. We have found that combining automated quality reports with specific guidance in the review process is an effective way to improve metadata and reduce manual review time. In addition, data from the completed review forms will allow us to assess whether the automated checks have decreased manual review time and improved metadata quality.
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2019
- Bibcode: 2019AGUFMIN14B..09K
- Keywords:
  - 1908 Cyberinfrastructure; INFORMATICS
  - 1912 Data management, preservation, rescue; INFORMATICS
  - 1930 Data and information governance; INFORMATICS
  - 1934 International collaboration; INFORMATICS