Performing live time-traversal queries via SPARQL on RDF datasets
Abstract
This article introduces a methodology for performing live time-traversal SPARQL queries on RDF datasets, together with software based on that methodology for managing the provenance and change-tracking of entities described in RDF. Provenance and change-tracking are crucial factors in ensuring verifiability and trust. Nevertheless, some of the most prominent knowledge bases - including DBpedia, Wikidata, Yago, and the Dynamic Linked Data Observatory - do not support time-agnostic queries, i.e., queries across different snapshots together with provenance information. The OpenCitations Data Model (OCDM) describes one possible way to track provenance and entities' changes in RDF datasets, and it allows restoring an entity to a specific status in time (i.e., a snapshot) by applying SPARQL update queries. The methodology and library presented in this article are based on the rationale introduced in the OCDM. We also developed benchmarks showing that this procedure is efficient for specific queries and less efficient for others. To the best of our knowledge, our library is the only one to support all the time-related retrieval functionalities live, i.e., enabling real-time searches and updates. Moreover, since OCDM complies with standard RDF, queries are expressed via standard SPARQL.
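The idea of restoring an entity to an earlier snapshot by applying update queries can be illustrated with a minimal sketch. It is not the library's API: it assumes that each recorded change can be reduced to a timestamp plus the sets of triples it added and removed, and that a past state is rebuilt by undoing, newest first, every change made after the requested time. The name `restore_snapshot`, the tuple layout of a change, and the sample identifiers are all hypothetical.

```python
from datetime import datetime
from typing import List, Set, Tuple

# Assumption: a triple is a plain (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]
# Assumption: a change is (generation time, triples added, triples removed).
Change = Tuple[datetime, Set[Triple], Set[Triple]]


def restore_snapshot(current: Set[Triple], changes: List[Change],
                     at_time: datetime) -> Set[Triple]:
    """Rebuild the state valid at `at_time` by undoing later changes."""
    state = set(current)
    # Walk the history from the newest change to the oldest.
    for generated_at, added, removed in sorted(
            changes, key=lambda change: change[0], reverse=True):
        if generated_at <= at_time:
            break  # this change and all older ones were already in effect
        # Invert the change: drop what it added, put back what it removed.
        state -= added
        state |= removed
    return state


# Hypothetical history of one entity: created in 2021, retitled in 2022.
old_title = ("ex:br/1", "dcterms:title", "Old title")
new_title = ("ex:br/1", "dcterms:title", "New title")
history = [
    (datetime(2021, 1, 1), {old_title}, set()),
    (datetime(2022, 1, 1), {new_title}, {old_title}),
]
present = {new_title}

mid_2021 = restore_snapshot(present, history, datetime(2021, 6, 1))
before_creation = restore_snapshot(present, history, datetime(2020, 1, 1))
```

In this sketch only the current state and the per-change deltas are stored, so no full copy of each snapshot is needed; the cost of a restoration grows with the number of changes that must be undone.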
- Publication: arXiv e-prints
- Pub Date: October 2022
- DOI: 10.48550/arXiv.2210.02534
- arXiv: arXiv:2210.02534
- Bibcode: 2022arXiv221002534M
- Keywords: Computer Science - Databases
- E-Print: 26 pages, 10 figures, 3 tables, submitted to the Journal of the Association for Information Science and Technology (JASIST)