Efficiently Finding a Maximal Clique Summary via Effective Sampling
Abstract
Maximal clique enumeration (MCE) is a fundamental problem in graph theory and is used in many applications, such as social network analysis, bioinformatics, intelligent agent systems, cyber security, etc. Most existing MCE algorithms focus on improving the efficiency rather than reducing the output size. The output unfortunately could consist of a large number of maximal cliques. In this paper, we study how to report a summary of less overlapping maximal cliques. The problem was studied before, however, after examining the pioneer approach, we consider it still not satisfactory. To advance the research along this line, our paper attempts to make four contributions: (a) we propose a more effective sampling strategy, which produces a much smaller summary but still ensures that the summary can somehow witness all the maximal cliques and the expectation of each maximal clique witnessed by the summary is above a predefined threshold; (b) we prove that the sampling strategy is optimal under certain optimality conditions; (c) we apply clique-size bounding and design new enumeration order to approach the optimality conditions; and (d) to verify experimentally, we test eight real benchmark datasets that have a variety of graph characteristics. The results show that our new sampling strategy consistently outperforms the state-of-the-art approach by producing smaller summaries and running faster on all the datasets.
- Publication:
-
arXiv e-prints
- Pub Date:
- September 2020
- DOI:
- 10.48550/arXiv.2009.10376
- arXiv:
- arXiv:2009.10376
- Bibcode:
- 2020arXiv200910376L
- Keywords:
-
- Computer Science - Databases;
- H.2
- E-Print:
- 20 pages, 80 figures