New Data Analysis Capabilities in Ocean Networks Canada's Data Management Platform
Abstract
Ocean Networks Canada (ONC) operates ocean and coastal observatories on all three of Canada's coasts. ONC's software infrastructure, known as Oceans 2.0, controls and acquires data from hundreds of different types of instruments, archives it and presents it to users via hundreds of different file formats, visualizations, interfaces and web services. The quantity of data acquired is approximately 300 GB daily and the total archive is approaching 800 TB. A crucial challenge for ONC is to reduce the data and make it salient and accessible to interested users.
New features have been added to the Oceans 2.0 platform to help users complete the scientific data life-cycle: the creation of knowledge from data. Video data is the foremost example: it is time-consuming to review and difficult to quantify objectively. To address this, an annotation system has been developed with a new adaptable taxonomy system for standard naming of species, morphology, events and more. Annotations can be exported into reports (including video frames or compilations), maps and statistics. Through the new SeaTube video portal, this system enables expert users to annotate live or archived footage. A revised Digital Fishers web application generates crowd-sourced annotations within the new system. Machine learning and automated processes can also generate annotations. Developing and maintaining processing code requires the direct input of scientists interested in the data. However, running that code entirely on a local machine is not feasible given the volume of data involved. To this end, a "sandbox" has been developed that is collocated with the data and makes use of scalable resources. Users upload their code or executable programs, schedule their jobs, download the processed, reduced data and can submit annotations to the system. The sandbox uses Docker technology and a new Oceans 2.0 API to address security and access concerns. This approach is available for all ONC data and we see great potential to add value to our video, hydrophone, sonar, radar and other complex data types.
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2018
- Bibcode: 2018AGUFMIN33E0892B
- Keywords:
- 1902 Community modeling frameworks; INFORMATICS
- 1920 Emerging informatics technologies; INFORMATICS
- 1936 Interoperability; INFORMATICS
- 1998 Workflow; INFORMATICS