Streaming Data Cyberinfrastructure for Global-Local-Global Analysis
Abstract
Streaming data collected from an ever-expanding number of sources, such as sensors, social media, and IoT devices, provide new opportunities for cross-domain and multi-scale modeling and analysis, which are critical to addressing the grand challenges of food and water security that result from a growing population, frequent natural disasters, and a changing climate. However, data producers and individual researchers face significant challenges in collecting, managing, and using streaming data: such data are heterogeneous, often lack standards for formats and access protocols, and arrive continuously and in large volumes. In this presentation we will introduce StreamCI, a flexible and scalable cyberinfrastructure solution that helps researchers collect, manage, process, and access streaming data in real time. StreamCI is built on an open-source software stack that includes RabbitMQ, node.js, MongoDB, Grafana, and HUBzero. The StreamCI backend is deployed on Geddes, Purdue's composable system, which provides scalable operation using Docker containers and Kubernetes autoscaling services; the backend can also be deployed on other cyberinfrastructures. We will present several applications that use the system to collect and process crop sensor data, water and air quality sensor data, data from IoT devices, and advanced manufacturing data. StreamCI is well positioned as a general streaming data solution to help the global-to-local-to-global (GLG) community with its data management needs.
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2022
- Bibcode: 2022AGUFMGC52I0254S