Publishing Linked Open Data for Physical Samples - Lessons Learned
Abstract
Most data and information about physical samples and associated sampling features currently reside in relational databases. Integrating common concepts from various databases has motivated us to publish Linked Open Data for collections of physical samples, using Semantic Web technologies including the Resource Description Framework (RDF), RDF Query Language (SPARQL), and Web Ontology Language (OWL). The goal of our work is threefold: To evaluate and select ontologies in different granularities for common concepts; to establish best practices and develop a generic methodology for publishing physical sample data stored in relational database as Linked Open Data; and to reuse standard community vocabularies from the International Commission on Stratigraphy (ICS), Global Volcanism Program (GVP), General Bathymetric Chart of the Oceans (GEBCO), and others. Our work leverages developments in the EarthCube GeoLink project and the Interdisciplinary Earth Data Alliance (IEDA) facility for modeling and extracting physical sample data stored in relational databases. Reusing ontologies developed by GeoLink and IEDA has facilitated discovery and integration of data and information across multiple collections including the USGS National Geochemical Database (NGDB), System for Earth Sample Registration (SESAR), and Index to Marine & Lacustrine Geological Samples (IMLGS). We have evaluated, tested, and deployed Linked Open Data tools including Morph, Virtuoso Server, LodView, LodLive, and YASGUI for converting, storing, representing, and querying data in a knowledge base (RDF triplestore). Using persistent identifiers such as Open Researcher & Contributor IDs (ORCIDs) and International Geo Sample Numbers (IGSNs) at the record level makes it possible for other repositories to link related resources such as persons, datasets, documents, expeditions, awards, etc. to samples, features, and collections. This work is supported by the EarthCube "GeoLink" project (NSF# ICER14-40221 and others) and the "USGS-IEDA Partnership to Support a Data Lifecycle Framework and Tools" project (USGS# G13AC00381).
- Publication:
-
AGU Fall Meeting Abstracts
- Pub Date:
- December 2016
- Bibcode:
- 2016AGUFMPA51B2263J
- Keywords:
-
- 1912 Data management;
- preservation;
- rescue;
- INFORMATICSDE: 1946 Metadata;
- INFORMATICSDE: 1982 Standards;
- INFORMATICSDE: 6699 General or miscellaneous;
- PUBLIC ISSUES