Towards an Ontology for the Global Geodynamics Project: Automated Extraction of Resource Descriptions from an XML-Based Data Model
Abstract
Using the Earth Science Markup Language (ESML), an XML-based data model for the Global Geodynamics Project (GGP) was recently introduced [Lumb & Aldridge, Proc. HPCS 2005, Kotsireas & Stacey, eds., IEEE, 2005, 216-222]. This data model possesses several key attributes -i.e., it: makes use of XML schema; supports semi-structured ASCII format files; includes Earth Science affinities; and is on track for compliance with emerging Grid computing standards (e.g., the Global Grid Forum's Data Format Description Language, DFDL). Favorable attributes notwithstanding, metadata (i.e., data about data) was identified [Lumb & Aldridge, 2005] as a key challenge for progress in enabling the GGP for Grid computing. Even in projects of small-to-medium scale like the GGP, the manual introduction of metadata has the potential to be the rate-determining metric for progress. Fortunately, an automated approach for metadata introduction has recently emerged. Based on Gleaning Resource Descriptions from Dialects of Languages (GRDDL, http://www.w3.org/2004/01/rdxh/spec), this bottom-up approach allows for the extraction of Resource Description Format (RDF) representations from the XML-based data model (i.e., the ESML representation of GGP data) subject to rules of transformation articulated via eXtensible Stylesheet Language Transformations (XSLT). In addition to introducing relationships into the GGP data model, and thereby addressing the metadata requirement, the syntax and semantics of RDF comprise a requisite for a GGP ontology - i.e., ``the common words and concepts (the meaning) used to describe and represent an area of knowledge'' [Daconta et al., The Semantic Web, Wiley, 2003]. After briefly reviewing the XML-based model for the GGP, attention focuses on the automated extraction of an RDF representation via GRDDL with XSLT-delineated templates. This bottom-up approach, in tandem with a top-down approach based on the Protege integrated development environment for ontologies (http://protege.stanford.edu/), allows for initial scoping of an ontology for the GGP. Such ontological approaches are key to enabling the use of formerly specific-purpose GGP data into broader systems and frameworks such as those demanded by current challenges in tsunami research following the devastating 26 December 2004 Sumatra-Andaman event.
- Publication:
-
AGU Fall Meeting Abstracts
- Pub Date:
- December 2005
- Bibcode:
- 2005AGUFMIN43A0328L
- Keywords:
-
- 9810 New fields (not classifiable under other headings);
- 9820 Techniques applicable in three or more fields