Connecting community data repositories for discovery and sharing by leveraging schema.org and JSON-LD
Abstract
Growing interest in leveraging web architecture and web standards to expose structured metadata describing data set resources is evident in the work of EarthCube Project 418 [1] and in related efforts by Google [2], DataONE, and others [3]. Typical use cases have involved exposing structured data for access and aggregation in large-scale discovery systems.
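As a rough illustration of what exposing structured metadata for a data set can look like, the following minimal sketch builds a schema.org Dataset description as JSON-LD and wraps it in the script element commonly embedded in landing pages for harvesting. The helper names (`dataset_jsonld`, `as_script_tag`), the example.org URL, and the identifier are hypothetical placeholders, not taken from any of the projects cited here.

```python
import json


def dataset_jsonld(name, description, url, identifier):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": url,
        "name": name,
        "description": description,
        "url": url,
        "identifier": identifier,
    }


def as_script_tag(doc):
    """Wrap a JSON-LD document in a script element of type application/ld+json."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical record: every value below is a placeholder for illustration.
record = dataset_jsonld(
    name="Example core description data set",
    description="Illustrative record only; not drawn from any repository.",
    url="https://example.org/dataset/123",
    identifier="example-123",
)
print(as_script_tag(record))
```

An aggregator that crawls such pages could parse the script body with any JSON-LD-aware tool; the property set shown is only a small assumed subset of what a repository might publish.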
Here, we will present results of Project 418 and offer an additional, contrasting use case for structured metadata. We will show these patterns being employed to provide resources for small communities of practice. Holdings from Open Core Data, the IODP Site Survey Data Bank (SSDB), IEDA, SESAR, and EarthRef MagIC will be used to highlight the distinct, yet complementary, nature of data within these systems. To better explore this potential we will prototype some use cases around:
- Connections based on PIDs such as IGSNs for samples
- Shared spatial features such as drill sites
- Temporal connections around geologic time
We will show extension patterns in JSON-LD that mix domain ontologies with the shared upper-level vocabulary of schema.org, and demonstrate usage of Shapes Constraint Language (SHACL) shape graphs to evaluate conformance of exposed data graphs to established conventions for interoperability. These SHACL shape graphs are a resource the community can author to define the vocabulary use expected in graphs around these shared nodes of interest. Once the graphs are collected, approaches such as the JSON-LD Framing API will be employed to extract elements for indexing and linking. These indexes will be exposed via services that allow the participants to filter results for use in their local interfaces. Alternatively, the indexes can be directly downloaded by the facilities for local use.
By building on practices that have been adopted in the commercial search engine community, the science community can benefit from improved search results in widely used search engines, while domain-specific extensions enable value-added capabilities. Simultaneously, the same patterns and content can be leveraged to improve linking and sharing across communities of practice.
[1] https://www.earthcube.org/group/project-418
[2] https://ai.googleblog.com/2017/01/facilitating-discovery-of-public.html
[3] https://bit.ly/2ODwHSy
- Publication:
- AGU Fall Meeting Abstracts
- Pub Date:
- December 2018
- Bibcode:
- 2018AGUFMIN24B..01F
- Keywords:
- 1904 Community standards; INFORMATICS
- 1908 Cyberinfrastructure; INFORMATICS
- 1916 Data and information discovery; INFORMATICS
- 1936 Interoperability; INFORMATICS