Application of Open Source Technologies for Oceanographic Data Analysis
Abstract
NEXUS is a data-intensive analysis solution developed with a new approach for handling science data that enables large-scale data analysis by leveraging open source technologies such as Apache Cassandra, Apache Spark, Apache Solr, and Webification. NEXUS has been selected to provide on-the-fly time-series and histogram generation for the Soil Moisture Active Passive (SMAP) mission for Level 2 and Level 3 Active, Passive, and Active Passive products. It also provides an on-the-fly data subsetting capability. NEXUS is designed to scale horizontally, enabling it to handle massive amounts of data in parallel. It takes a new approach on managing time and geo-referenced array data by dividing data artifacts into chunks and stores them in an industry-standard, horizontally scaled NoSQL database. This approach enables the development of scalable data analysis services that can infuse and leverage the elastic computing infrastructure of the Cloud. It is equipped with a high-performance geospatial and indexed data search solution, coupled with a high-performance data Webification solution free from file I/O bottlenecks, as well as a high-performance, in-memory data analysis engine. In this talk, we will focus on the recently funded AIST 2014 project by using NEXUS as the core for oceanographic anomaly detection service and web portal. We call it, OceanXtremes
- Publication:
-
AGU Fall Meeting Abstracts
- Pub Date:
- December 2015
- Bibcode:
- 2015AGUFMIN51B1810H
- Keywords:
-
- 1908 Cyberinfrastructure;
- INFORMATICS;
- 1916 Data and information discovery;
- INFORMATICS;
- 1920 Emerging informatics technologies;
- INFORMATICS;
- 1926 Geospatial;
- INFORMATICS