Data Flow for the TERRA-REF project
Abstract
The Transportation Energy Resources from Renewable Agriculture Phenotyping Reference Platform (TERRA-REF) program aims to identify crop traits that are best suited to producing high-energy sustainable biofuels and match those plant characteristics to their genes to speed the plant breeding process. One tool used to achieve this goal is a high-throughput phenotyping robot outfitted with sensors and cameras to monitor the growth of 1.25 acres of sorghum. Data types range from hyperspectral imaging to 3D reconstructions and thermal profiles, all at 1mm resolution. This system produces thousands of daily measurements with high spatiotemporal resolution. The team at NCSA processes, annotates, organizes and stores the massive amounts of data produced by this system - up to 5 TB per day. Data from the sensors is streamed to a local gantry-cache server. The standardized sensor raw data stream is automatically and securely delivered to NCSA using Globus Connect service. Once files have been successfully received by the Globus endpoint, the files are removed from the gantry-cache server. As each dataset arrives or is created the Clowder system automatically triggers different software tools to analyze each file, extract information, and convert files to a common format. Other tools can be triggered to run after all required data is uploaded. For example, a stitched image of the entire field is created after all images of the field become available. Some of these tools were developed by external collaborators based on predictive models and algorithms, others were developed as part of other projects and could be leveraged by the TERRA project. Data will be stored for the lifetime of the project and is estimated to reach 10 PB over 3 years. The Clowder system, BETY and other systems will allow users to easily find data by browsing or searching the extracted information.
- Publication:
-
AGU Fall Meeting Abstracts
- Pub Date:
- December 2017
- Bibcode:
- 2017AGUFMIN31A0063K
- Keywords:
-
- 1916 Data and information discovery;
- INFORMATICS;
- 1920 Emerging informatics technologies;
- INFORMATICS;
- 1932 High-performance computing;
- INFORMATICS;
- 1998 Workflow;
- INFORMATICS