Property Graph Schema Optimization for Domain-Specific Knowledge Graphs
Abstract
Enterprises are creating domain-specific knowledge graphs by curating and integrating their business data from multiple sources. The data in these knowledge graphs can be described using ontologies, which provide a semantic abstraction to define the content in terms of the entities and the relationships of the domain. The rich semantic relationships in an ontology contain a variety of opportunities to reduce edge traversals and consequently improve the graph query performance. Although there has been a lot of effort to build systems that enable efficient querying over knowledge graphs, the problem of schema optimization for query performance has been largely ignored in the graph setting. In this work, we show that graph schema design has significant impact on query performance, and then propose optimization algorithms that exploit the opportunities from the domain ontology to generate efficient property graph schemas. To the best of our knowledge, we are the first to present an ontology-driven approach for property graph schema optimization. We conduct empirical evaluations with two real-world knowledge graphs from medical and financial domains. The results show that the schemas produced by the optimization algorithms achieve up to 2 orders of magnitude speed-up compared to the baseline approach.
- Publication:
-
arXiv e-prints
- Pub Date:
- March 2020
- DOI:
- 10.48550/arXiv.2003.11580
- arXiv:
- arXiv:2003.11580
- Bibcode:
- 2020arXiv200311580L
- Keywords:
-
- Computer Science - Databases
- E-Print:
- To appear in ICDE 2021