Scalable and Robust Community Detection with Randomized Sketching

doi:10.48550/arXiv.1805.10927

This article explores and analyzes the unsupervised clustering of large, partially observed graphs. We propose a scalable and provable randomized framework for clustering graphs generated from the stochastic block model. The clustering is first applied to a sub-matrix of the graph's adjacency matrix associated with a reduced graph sketch constructed using random sampling. Then, the clusters of the full graph are inferred from the clusters extracted from the sketch using a correlation-based retrieval step. Uniform random node sampling is shown to reduce the computational complexity relative to clustering the full graph when the cluster sizes are balanced. A new random degree-based node sampling algorithm is presented that significantly improves the performance of the clustering algorithm, even when clusters are unbalanced. This framework improves the phase transitions for matrix-decomposition-based clustering with regard to computational complexity and minimum cluster size, which are shown to be nearly dimension-free in the low inter-cluster connectivity regime. A third sampling technique is shown to improve balance by randomly sampling nodes based on spatial distribution. We provide analysis and numerical results using a convex clustering algorithm based on matrix completion.
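The sketch-then-retrieve pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made for illustration: the graph is fully observed, the sketch is clustered with a simple spectral embedding plus k-means instead of the convex matrix-completion method the paper analyzes, the retrieval rule assigns each node to the sketch cluster it is most densely connected to, and the inverse-degree weighting in `inverse_degree_probs` is only one plausible form of degree-based sampling. All function names and parameter values are hypothetical.

```python
from itertools import permutations

import numpy as np


def sbm_adjacency(sizes, p, q, rng):
    """Sample a symmetric SBM adjacency matrix (intra-cluster prob p, inter-cluster prob q)."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    probs = np.where(labels[:, None] == labels[None, :], p, q)
    upper = np.triu(rng.random(probs.shape) < probs, k=1)  # strict upper triangle
    return (upper | upper.T).astype(float), labels


def inverse_degree_probs(adj):
    """Assumed degree-based rule: sampling probability inversely related to node degree."""
    weights = 1.0 / (adj.sum(axis=1) + 1.0)  # +1 avoids division by zero
    return weights / weights.sum()


def cluster_sketch(sub_adj, k, n_iter=20):
    """Stand-in for the paper's convex clustering: spectral embedding plus k-means."""
    vals, vecs = np.linalg.eigh(sub_adj)
    emb = vecs[:, np.argsort(np.abs(vals))[-k:]]  # k largest-magnitude eigenvectors
    # Farthest-first initialization keeps the k-means loop deterministic.
    centers = [emb[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(emb - c, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(emb[:, None, :] - centers[None, :, :], axis=2)
        assign = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = emb[assign == j].mean(axis=0)
    return assign


def sketch_and_cluster(adj, k, m, rng, probs=None):
    """Cluster an m-node sketch, then label every node of the full graph."""
    idx = rng.choice(adj.shape[0], size=m, replace=False, p=probs)
    sketch_labels = cluster_sketch(adj[np.ix_(idx, idx)], k)
    indicators = np.eye(k)[sketch_labels]  # m x k cluster membership matrix
    # Retrieval (assumed form): fraction of each sketch cluster a node connects to.
    scores = adj[:, idx] @ indicators / np.maximum(indicators.sum(axis=0), 1.0)
    return np.argmax(scores, axis=1)


def accuracy(pred, truth, k):
    """Fraction of correctly labeled nodes under the best relabeling of clusters."""
    return max(np.mean(np.array(perm)[pred] == truth) for perm in permutations(range(k)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    adj, truth = sbm_adjacency([100, 100, 100], p=0.7, q=0.05, rng=rng)
    pred = sketch_and_cluster(adj, k=3, m=120, rng=rng)
    print("accuracy with uniform sampling:", accuracy(pred, truth, 3))
```

Only the 120-by-120 sub-matrix is passed to the clustering routine here; the retrieval step touches the full graph with a single matrix product. Passing `probs=inverse_degree_probs(adj)` switches from uniform to the assumed degree-based sampling.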

Publication:

arXiv e-prints

Pub Date:

May 2018

DOI:

10.48550/arXiv.1805.10927

arXiv:

arXiv:1805.10927

Bibcode:

2018arXiv180510927R

Keywords:

Computer Science - Social and Information Networks;
Computer Science - Machine Learning;
Statistics - Machine Learning

E-Print:

IEEE Transactions on Signal Processing, vol. 68, pp. 962-977, 2020
