Data Relationships: Towards a Conceptual Model of Scientific Data Catalogs
Abstract
As the volume of data and the variety of processing types and storage formats increase, the total number of record permutations increases dramatically. The result is an overwhelming number of records, which makes it more difficult to identify the best data object to answer a user's needs. The issue is further complicated because each archive's data catalog may be designed around different concepts: anything from individual files to be served, to series of similarly generated and processed data, to something entirely different. Catalogs need not be flat tables; they may be structured as multiple tables, each holding a different data series, or as a normalized structure of the individual data files. Merging federated search results from archives with different catalog designs can create situations where the data object of interest is difficult to find because it is buried among seemingly similar or entirely unwanted records. We present a reference model for discussing data catalogs and the complex relationships between similar data objects. We show how the model can be used to improve scientists' ability to quickly identify the best data object for their purposes, and we discuss the technical issues that must be addressed to use this model in a federated system.
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2008
- Bibcode: 2008AGUFMIN22A..03H
- Keywords: 0525 Data management; 6339 System design; 9810 New fields (not classifiable under other headings); 9820 Techniques applicable in three or more fields