IoT-Hub: A new cloud data-platform for monitoring and analyzing IoT
Abstract
This work presents IoT-Hub, a new scalable, elastic, efficient, and portable Internet of Things (IoT) data-platform based on microservices for monitoring and analysing large-scale sensor data in real time. The current implementation of IoT-Hub includes a service pipeline composed of Apache Kafka, Apache Spark, Elasticsearch, Cassandra, and Kibana, which allows for automated gathering, preprocessing, storing, and visualization of IoT streams. All middleware has been containerized, which enables flexible and agile development and deployment in cloud-based infrastructures.
We have demonstrated the feasibility of IoT-Hub via a real use case, the Environmental Baseline Monitoring programme of the British Geological Survey (BGS). This programme is the first independent, integrated monitoring study to characterize the environmental baseline in Lancashire and the Vale of Pickering (England, UK), areas under close scrutiny in anticipation of the development of a nascent UK shale-gas industry. The monitoring involves a wide range of sensors, covering groundwater quality, seismicity, and air composition. We have initially focused on groundwater-quality sensors, which provide measurements of water-quality parameters, but very little work would be needed in IoT-Hub to support other sensors. For our experiments, we used the NSF Chameleon cloud with a CentOS 7 image and 42 CPUs; the framework could be deployed to any other cloud system. IoT-Hub ingests water-quality streams through Apache Kafka and stores them in a Cassandra database (for persistent storage) and in Elasticsearch (for indexing and search). To test the capacity of IoT-Hub to run complex data analysis, we created a warning system to interpret sensor behaviour in the field (such as battery failures or the nature of seasonal patterns). The warning system consists of an Apache Spark application that implements the Seasonal Hybrid Extreme Studentized Deviate algorithm. It periodically queries data from the Cassandra database and detects anomalies in each of the water-quality parameters. IoT-Hub could also act as a backend for a Virtual Research Environment. As future work, we plan to include more middleware in IoT-Hub, such as an RDF repository, a SPARQL endpoint, and Jupyter Notebook.
- Publication:
- AGU Fall Meeting Abstracts
- Pub Date:
- December 2018
- Bibcode:
- 2018AGUFMIN51B0586F
- Keywords:
- INFORMATICS: 1908 Cyberinfrastructure
- INFORMATICS: 1920 Emerging informatics technologies
- INFORMATICS: 1932 High-performance computing
- INFORMATICS: 1976 Software tools and services
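
The abstract names the Seasonal Hybrid Extreme Studentized Deviate (S-H-ESD) algorithm as the basis of the warning system but gives no implementation details. The following is a minimal, single-machine sketch of the general idea in plain Python, not the authors' Spark application. It rests on several assumptions that are not in the abstract: the seasonal component is estimated with per-phase medians rather than an STL decomposition, the Student-t quantile in the critical value is approximated by a normal quantile (reasonable only for long series), and the function name, parameters, and synthetic readings are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def seasonal_hybrid_esd(values, period, max_anomalies, alpha=0.05):
    """Return the sorted indices of values flagged as anomalous (simplified S-H-ESD)."""
    n = len(values)
    # Seasonal component: the median of all observations that share a phase.
    # Assumption: this is a simple stand-in for an STL decomposition.
    phase_median = [statistics.median(values[p::period]) for p in range(period)]
    # Subtracting the per-phase median removes both the seasonal pattern
    # and the overall level of the series.
    residuals = {i: v - phase_median[i % period] for i, v in enumerate(values)}

    removed = []      # candidate indices, most extreme first
    n_anomalies = 0   # largest i whose test statistic exceeds its critical value
    for i in range(1, max_anomalies + 1):
        if n - i - 1 <= 0:
            break
        remaining = list(residuals.values())
        center = statistics.median(remaining)
        # Median absolute deviation, scaled to be comparable to a standard deviation.
        mad = 1.4826 * statistics.median([abs(r - center) for r in remaining])
        if mad == 0:
            break
        # Most extreme remaining point, measured with robust statistics.
        idx = max(residuals, key=lambda j: abs(residuals[j] - center))
        score = abs(residuals[idx] - center) / mad
        del residuals[idx]
        removed.append(idx)
        # Critical value of the generalized ESD test.
        # Assumption: the Student-t quantile is approximated by a normal quantile.
        p = 1 - alpha / (2 * (n - i + 1))
        t = NormalDist().inv_cdf(p)
        critical = (n - i) * t / math.sqrt((n - i - 1 + t * t) * (n - i + 1))
        if score > critical:
            n_anomalies = i
    return sorted(removed[:n_anomalies])


# Hypothetical hourly readings: a daily cycle, small deterministic jitter,
# and two injected faults at indices 50 and 130.
readings = [
    10 + 3 * math.sin(2 * math.pi * t / 24) + 0.02 * ((t * 37) % 11 - 5)
    for t in range(240)
]
readings[50] += 8.0
readings[130] -= 8.0

anomalies = seasonal_hybrid_esd(readings, period=24, max_anomalies=10)
print(anomalies)
```

Using the median and the median absolute deviation instead of the mean and standard deviation keeps the test statistic from being distorted by the very outliers it is trying to find, which is the "hybrid" part of the method. In a deployment like the one described, a function of this kind would be applied separately to each water-quality parameter over a window of data queried from storage.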