The Case for Adopting Server-side Analytics
Abstract
The standard method for accessing Earth and space science data relies on a scheme developed decades ago: data residing in one or many data stores must be parsed out and shipped via internet lines or physical transport to the researcher, who in turn stores the data locally for analysis. The analysis tasks are varied and include visualization, parameterization, and comparison with or assimilation into physics models. In many cases this process is inefficient and unwieldy as the data sets become larger and the demands on the analysis tasks become more sophisticated and complex. For about a decade, several groups have explored an alternative to this model. The names applied to the new paradigm include "data analytics", "climate analytics", and "server-side analytics" (SSA). The general concept is that in close network proximity to the data store there will be a tailored processing capability appropriate to the type and use of the data served. The user of the server-side analytics will operate on the data with numerical procedures. The procedures can be accessed via canned code, a scripting processor, or an analysis package such as Matlab, IDL, or R. Results of the analytics processes will then be relayed via the internet to the user. In practice, these results will be at a much lower volume, easier to transport to and store locally by the user, and easier for the user to interoperate with data sets from other remote data stores. The user can also iterate on the processing call to tailor the results as needed. A major component of server-side analytics could be to provide sets of tailored results to end users in order to eliminate the repetitive preconditioning that is often required with these data sets and that drives many of the throughput challenges. NASA's Big Data Task Force studied this issue. This paper will present the results of this study, including examples of SSAs that are being developed and demonstrated and suggestions for architectures that might be developed for future applications.
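The request-and-reduce pattern described in the abstract can be illustrated with a minimal Python sketch. It is not drawn from the Task Force study or from any of the SSAs mentioned. The in-memory `DATA_STORE`, the `PROCEDURES` registry, the `handle_request` and `client_call` functions, and the JSON request fields are all hypothetical names chosen for illustration, and a direct function call stands in for the network hop between user and server.

```python
import json
import statistics

# Hypothetical data store held next to the analytics service:
# 365 daily fields of 1000 grid cells each (synthetic values).
DATA_STORE = {
    "surface_temp": [
        [day + 0.1 * cell for cell in range(1000)] for day in range(365)
    ],
}

# Hypothetical registry of "canned" numerical procedures the server exposes.
PROCEDURES = {
    "mean": statistics.fmean,
    "min": min,
    "max": max,
}


def handle_request(request_json: str) -> str:
    """Server side: apply the named procedure to each requested time step
    and return only the reduced series, not the underlying fields."""
    request = json.loads(request_json)
    fields = DATA_STORE[request["variable"]]
    reduce_fn = PROCEDURES[request["procedure"]]
    selected = fields[request["start"]:request["stop"]]
    values = [reduce_fn(field) for field in selected]
    return json.dumps({
        "variable": request["variable"],
        "procedure": request["procedure"],
        "values": values,
    })


def client_call(variable: str, procedure: str, start: int, stop: int) -> dict:
    """User side: build a processing call and decode the relayed result.
    The direct call to handle_request stands in for a network request."""
    request = json.dumps({
        "variable": variable,
        "procedure": procedure,
        "start": start,
        "stop": stop,
    })
    return json.loads(handle_request(request))


if __name__ == "__main__":
    # First call, then an iteration with a different procedure and range.
    daily_mean = client_call("surface_temp", "mean", 0, 30)
    daily_max = client_call("surface_temp", "max", 0, 7)

    # Compare the size of the relayed result with the full variable.
    full_size = len(json.dumps(DATA_STORE["surface_temp"]))
    result_size = len(json.dumps(daily_mean))
    print(f"full variable: {full_size} bytes, relayed result: {result_size} bytes")
    print(f"iterated call returned {len(daily_max['values'])} values")
```

Only the reduced series crosses the boundary between the two functions, and the user refines the result by changing the arguments of the processing call rather than by downloading the fields again.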
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2017
- Bibcode: 2017AGUFMIN31A0064T
- Keywords:
  - 1916 Data and information discovery; INFORMATICS
  - 1920 Emerging informatics technologies; INFORMATICS
  - 1932 High-performance computing; INFORMATICS
  - 1998 Workflow; INFORMATICS