A Web 2.0 Application for Executing Queries and Services on Climatic Data
Abstract
For many years, countries have collected data in order to understand climate, to study its effect on living species, and to predict its future behavior. Nowadays, terabytes of data are collected by governmental agencies and academic institutions, and the current challenge is how to provide appropriate access to this vast amount of climatic data. Each country has a different situation with respect to the collection and use of these data. In Venezuela in particular, a few institutions have systematically gathered observational and hydrological data, but the data are mostly recorded on non-digital media that have been lost or have deteriorated over the years, which restricts data availability. In 2006, a joint project between two major Venezuelan universities, Universidad Simón Bolívar (USB) and Universidad Central de Venezuela (UCV), was initiated. The goal of the project is to develop a digital repository of the country's climatic and hydrological data, and to build an application that provides querying and service execution capabilities over these data. The repository has been conceptually modeled as a database that integrates observational data and metadata. The metadata include an inventory of all the stations where data have been collected and a description of the measurements themselves, for instance the instruments used for the collection, the time granularity of the measurements, and their units of measure. The resulting data model combines traditional entity-relationship concepts with star and snowflake schemas from data warehouses. The model allows the inclusion of historical or current data, and each kind of data requires a different loading process. Special emphasis has been given to representing the quality of the data stored in the repository. Quality attributes can be attached to each individual value or to sets of values; these attributes can represent the statistical or semantic quality of the data.
Values can be stored at any level of aggregation (hourly, daily, monthly) so that they can be provided to the user at the desired level. This means that additional caution has to be exercised in query answering in order to distinguish between primary and derived data. In addition, a Web 2.0 application is being designed to provide a front end to the repository. This design focuses on two important aspects: the use of metadata structures, and the definition of collaborative Web 2.0 features that can be integrated into a project of this nature. Metadata descriptors include, for a set of measurements, its quality, granularity, and other dimension information. With these descriptors it is possible to establish relationships between different sets of measurements and to provide scientists with efficient search mechanisms that determine the related sets of measurements that contribute to a query answer. Unlike traditional applications for climatic data, our approach satisfies the requirements not only of researchers specialized in this domain but also of anyone interested in this area; one of the objectives is to build an informal knowledge base that can be improved and consolidated through use of the system.
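The distinction between primary and derived data at different aggregation levels can be illustrated with a minimal Python sketch. This is not the project's actual schema or code: the `Measurement` record, the `daily_value` function, the field names, and the station identifier `ST01` are all hypothetical, chosen only to show one way a query might prefer a stored daily value and otherwise fall back to a value derived from hourly observations, flagging it as derived and attaching a quality note.

```python
from dataclasses import dataclass
from datetime import date, datetime
from statistics import mean
from typing import List, Optional


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record: one observed or derived value plus its metadata."""
    station_id: str
    variable: str
    granularity: str            # "hourly" or "daily" in this sketch
    timestamp: datetime
    value: float
    derived: bool = False       # False = primary (observed), True = computed
    quality: Optional[str] = None


def daily_value(records: List[Measurement], station_id: str,
                variable: str, day: date) -> Optional[Measurement]:
    """Answer a daily query, keeping primary and derived data distinguishable."""
    matching = [r for r in records
                if r.station_id == station_id
                and r.variable == variable
                and r.timestamp.date() == day]

    # Prefer a primary value already stored at the requested granularity.
    stored = [r for r in matching if r.granularity == "daily" and not r.derived]
    if stored:
        return stored[0]

    # Otherwise derive the daily value from primary hourly observations.
    hourly = [r for r in matching if r.granularity == "hourly" and not r.derived]
    if not hourly:
        return None
    return Measurement(
        station_id=station_id,
        variable=variable,
        granularity="daily",
        timestamp=datetime(day.year, day.month, day.day),
        value=mean(r.value for r in hourly),
        derived=True,
        quality=f"mean of {len(hourly)} hourly values",
    )


# Usage with made-up sample data (station "ST01" is an assumption).
records = [
    Measurement("ST01", "temperature_c", "hourly", datetime(2007, 1, 1, 0), 20.0),
    Measurement("ST01", "temperature_c", "hourly", datetime(2007, 1, 1, 12), 30.0),
]
result = daily_value(records, "ST01", "temperature_c", date(2007, 1, 1))
```

Carrying the `derived` flag and the quality note on the returned value lets a front end show the user whether an answer was observed directly or computed from finer-grained data.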
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2007
- Bibcode: 2007AGUFMIN53B1201A
- Keywords: 0430 Computational methods and data processing; 0434 Data sets; 9360 South America; 9820 Techniques applicable in three or more fields