Data Type Registry - Cross Road Between Catalogs, Data And Semantics
Abstract
As more data become accessible online, opportunities are growing to improve search for information within datasets and to automate some levels of data integration. A prerequisite for these advances is indexing the kinds of information that are present in datasets and providing machine-actionable descriptions of data structures. We are exploring approaches to enabling these capabilities in the EarthCube DigitalCrust and Data Discovery Hub Building Block projects, building on the Data Type Registry (DTR) workgroup activity in the Research Data Alliance. We are prototyping a registry implementation using the CNRI Cordra platform and API to enable 'deep registration' of datasets for building hydrogeologic models of the Earth's crust, and for executing complex science scenarios for river chemistry and coral bleaching data. These use cases require the ability to respond to queries such as: What are the properties of Entity X? What entities include property Y (or L, M, N…)? What DataTypes are about Entity X and include property Y? Developing the registry to support these queries requires more in-depth metadata than is commonly available, so we are also exploring approaches to analyzing simple tabular data to automate recognition of entities and properties, and to assist users with establishing semantic mappings to data integration vocabularies. This poster will review the current capabilities and implementation of a data type registry.
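The three query patterns named in the abstract can be illustrated with a minimal in-memory sketch. This is not the Cordra API or the projects' implementation: the `ToyRegistry` class, its method names, and the sample records are all hypothetical, and the reading of "property Y (or L, M, N…)" as an any-of match is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataType:
    """Hypothetical registry record describing one registered data type."""
    name: str
    entity: str            # the entity the data type is about
    properties: frozenset  # names of the properties the data type includes


class ToyRegistry:
    """In-memory stand-in for a data type registry (assumption, not Cordra)."""

    def __init__(self):
        self._types = []

    def register(self, name, entity, properties):
        self._types.append(DataType(name, entity, frozenset(properties)))

    def properties_of(self, entity):
        """What are the properties of Entity X?"""
        props = set()
        for t in self._types:
            if t.entity == entity:
                props |= t.properties
        return props

    def entities_with(self, *props):
        """What entities include property Y (or L, M, N)? Any-of match (assumed)."""
        wanted = set(props)
        return {t.entity for t in self._types if t.properties & wanted}

    def datatypes_about(self, entity, prop):
        """What DataTypes are about Entity X and include property Y?"""
        return sorted(
            t.name for t in self._types
            if t.entity == entity and prop in t.properties
        )


# Illustrative records only; entity and property names are made up.
registry = ToyRegistry()
registry.register("river_chemistry_samples", "River", {"pH", "temperature"})
registry.register("river_gauge_readings", "River", {"discharge"})
registry.register("coral_bleaching_survey", "CoralReef",
                  {"temperature", "bleaching_severity"})

print(registry.properties_of("River"))
print(registry.entities_with("temperature"))
print(registry.datatypes_about("River", "pH"))
```

A real registry would need persistent identifiers and mappings to shared vocabularies so that "temperature" in two datasets is known to mean the same thing; the sketch simply matches on literal strings.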
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2017
- Bibcode: 2017AGUFMIN32A..06R
- Keywords: 1916 Data and information discovery; 1920 Emerging informatics technologies; 1982 Standards; 1998 Workflow (all INFORMATICS)