Data Management Plans and Planning for Scientists in the Earth and Space Sciences
Abstract
In recent years, Earth and space science funding agencies worldwide have come to understand that publication of funded scientific results is only part of their mandate; they must also preserve and provide access to the data from which those results were derived. As a result, many new data systems and data archives have been created. Scientists are now frequently required to archive the data that they acquire and to describe their archive plans in a data management plan (DMP) as part of the proposal process. Here we present the elements of a good data management plan, regardless of the funding agency. These elements include beginning the planning process early, selecting the right archive, understanding the standards and practices of that archive, developing a realistic work plan (and budget), and allowing schedule margin for inevitable slippage. A key to creating a quality DMP is to start planning early and leave plenty of time to write the DMP. Even scientists who are experts in the archiving process may find that the archive to which they have delivered data in the past has modernized or otherwise updated its standards or practices. Often it is not obvious where data should be archived, or even whether an appropriate data system exists. Once the target archive or data system has been determined, it is important to learn that system's standards and practices, including acceptable data formats, metadata requirements, nomenclature, and any additional required documentation. A clear understanding of the system requirements is needed to develop a credible work plan and budget for the funding agency. The work plan should list the tasks and their due dates and should identify who will perform each task. Data must be documented clearly and completely. The documentation task should be assigned to a scientist who fully understands the data, its acquisition, and its processing and calibration.
Some archives require external review of data and documentation. If review is required, the plan should include time for both the review and the response to the reviewers' comments. Even if an external review is not required, having a Co-Investigator or colleague review the data set makes sense. Data preparers are often so close to the data that they forget to document some of the details of the acquisition or processing that others need in order to use the data correctly. Some archives require or encourage inclusion of software used in the data processing. Well-documented software can be included with the archive documentation and/or uploaded to a public software repository such as GitHub (or NASA GitHub for NASA planetary data). Archiving should immediately follow data processing and not be deferred to the end of the performance period. Documenting and archiving the data promptly ensures that details of the process will not be lost in the interim. All of these components should be covered in the DMP. Lastly, given the long list of items that need to be determined prior to writing and then discussed in the DMP, the value of starting the process early cannot be overstated.
- Publication: 42nd COSPAR Scientific Assembly
- Pub Date: July 2018
- Bibcode: 2018cosp...42E1642J