Separating Structure from Noise in Large Graphs Using the Regularity Lemma
Abstract
How can we separate structural information from noise in large graphs? To address this fundamental question, we propose a graph summarization approach based on Szemerédi's Regularity Lemma, a well-known result in graph theory, which roughly states that every graph can be approximated by the union of a small number of random-like bipartite graphs called "regular pairs". Hence, the Regularity Lemma provides us with a principled way to describe the essential structure of large graphs using a small amount of data. Our paper has several contributions: (i) We present our summarization algorithm which is able to reveal the main structural patterns in large graphs. (ii) We discuss how to use our summarization framework to efficiently retrieve from a database the top-k graphs that are most similar to a query graph. (iii) Finally, we evaluate the noise robustness of our approach in terms of the reconstruction error and the usefulness of the summaries in addressing the graph search task.
- Publication:
-
Pattern Recognition
- Pub Date:
- February 2020
- DOI:
- 10.1016/j.patcog.2019.107070
- arXiv:
- arXiv:1905.06917
- Bibcode:
- 2020PatRe..9807070F
- Keywords:
-
- Regularity lemma;
- Graph summarization;
- Structural patterns;
- Noise;
- Randomness;
- Graph similarity search;
- Computer Science - Data Structures and Algorithms;
- Computer Science - Social and Information Networks