The Role of Shared Information Models for Software Reuse in Cross-Disciplinary Data Systems
Abstract
A shared information model is vital for enabling correlative science, data system interoperability, and effective cross-discipline search. The use of common terminology enables scientists to communicate more precisely about their data and enables machines to interoperate at levels far above the simple exchange of data structures. Furthermore, research has shown that a shared information model is important for developing successful cross-disciplinary systems, since the attempt to harmonize disparate information models is essentially cryptography. The Apache Object Oriented Data Technology (OODT) software framework, developed at the Jet Propulsion Laboratory (JPL), supports the development of cross-disciplinary science data systems through an open source implementation. This framework has been applied to various areas of earth, planetary, lunar, astrophysics, and biomedical scientific research. In addition, multi-institutional and international data systems have been developed around this software product line. To enable this versatility, appropriate architectural boundaries had to be observed that separate the common data management components from the discipline-specific requirements. A shared information model can capture most of the discipline-specific requirements by formally and unambiguously defining the science discipline's concepts and their relationships. It is developed using a knowledge acquisition process and software such as an ontology modeling tool. The contents of the information model can be extracted to produce the artifacts necessary for configuring the various data management components of the system. This paper presents the Planetary Data System (PDS) Information Model as a use case and describes how it functions as the source for the majority of the discipline-specific requirements for the PDS4 data system.
The information model, as a knowledge base of common terms, attributes, and classes, is used to generate three kinds of artifacts: XML Schema for data collection, validation, and preservation; query models for conventional and semantic search; and configuration files that give a generic registry the information it needs to classify, associate, and index the product types to be managed by the PDS. Using a formal knowledge acquisition process, a team of information technology and planetary science experts has taken about two years to create the core information model for a cross-disciplinary data system.
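As a rough illustration of how artifacts might be extracted from an information model, the sketch below turns a tiny in-memory model into an XML Schema fragment and a registry configuration. It is a minimal sketch under stated assumptions, not the PDS4 tooling: the class name `Example_Product`, its attributes, the `MODEL` layout, and both function names are hypothetical and do not come from the PDS Information Model or from Apache OODT.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical mini information model (assumption, not PDS4 content):
# class name -> list of (attribute name, XSD type, required, indexed)
MODEL = {
    "Example_Product": [
        ("title", "xs:string", True, True),
        ("start_time", "xs:dateTime", False, True),
        ("comment", "xs:string", False, False),
    ],
}


def model_to_xsd(model):
    """Emit one xs:complexType per class, with one xs:element per attribute."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    for class_name, attributes in model.items():
        ctype = ET.SubElement(schema, f"{{{XS}}}complexType", name=class_name)
        seq = ET.SubElement(ctype, f"{{{XS}}}sequence")
        for attr_name, xsd_type, required, _indexed in attributes:
            ET.SubElement(
                seq,
                f"{{{XS}}}element",
                name=attr_name,
                type=xsd_type,
                minOccurs="1" if required else "0",
            )
    return ET.tostring(schema, encoding="unicode")


def model_to_registry_config(model):
    """List, per product type, the attributes a registry should index."""
    return {
        class_name: [name for name, _type, _req, indexed in attributes if indexed]
        for class_name, attributes in model.items()
    }


if __name__ == "__main__":
    print(model_to_xsd(MODEL))
    print(model_to_registry_config(MODEL))
```

The point of the design is that the schema and the registry configuration are both derived from the same model, so a change to a class or attribute definition propagates to every generated artifact rather than being edited in each component separately.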
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2011
- Bibcode: 2011AGUFMIN21D..06H
- Keywords:
  - 1900 INFORMATICS
  - 1920 INFORMATICS / Emerging informatics technologies
  - 1958 INFORMATICS / Ontologies
  - 1978 INFORMATICS / Software re-use