Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models
Abstract
We present an efficient algorithm for the inference of stochastic block models in large networks. The algorithm can be used as an optimized Markov chain Monte Carlo (MCMC) method, with a fast mixing time and a much reduced susceptibility to getting trapped in metastable states, or as a greedy agglomerative heuristic, with an almost linear O(N ln² N) complexity, where N is the number of nodes in the network, independent of the number of blocks being inferred. We show that the heuristic is capable of delivering results which are indistinguishable from the more exact and numerically expensive MCMC method in many artificial and empirical networks, despite being much faster. The method is entirely unbiased towards any specific mixing pattern, and in particular it does not favor assortative community structures.
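The abstract does not state the objective function that is being optimized, so the following is only a minimal sketch under an assumption: that a partition is scored by the sparse-limit entropy of the non-degree-corrected stochastic block model, S = E - (1/2) Σ_rs e_rs ln(e_rs / (n_r n_s)). The function name `sbm_entropy` and the toy graph are hypothetical and chosen purely for illustration. The sketch shows why such a score need not favor assortative structure: on a complete bipartite graph, the purely disassortative two-block split scores better (lower entropy) than a mixed split.

```python
import math
from collections import Counter

def sbm_entropy(edges, blocks):
    """Sparse-limit entropy (in nats) of a non-degree-corrected SBM fit.

    Assumed form (not taken from the abstract):
        S = E - 1/2 * sum_rs e_rs * ln(e_rs / (n_r * n_s))

    edges  : list of (u, v) pairs of an undirected simple graph
    blocks : dict mapping each node to its block label
    """
    n = Counter(blocks.values())  # n_r: number of nodes in block r
    e = Counter()                 # e_rs: edge counts between blocks r and s
    for u, v in edges:
        r, s = blocks[u], blocks[v]
        e[(r, s)] += 1
        e[(s, r)] += 1            # for r == s, e_rr ends up as twice the internal edge count
    fit = sum(m * math.log(m / (n[r] * n[s])) for (r, s), m in e.items())
    return len(edges) - 0.5 * fit

# Toy disassortative graph: complete bipartite K_{3,3} on nodes 0..2 and 3..5.
edges = [(u, v) for u in range(3) for v in range(3, 6)]
bipartite = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
mixed = {0: "a", 1: "a", 3: "a", 2: "b", 4: "b", 5: "b"}

# Lower entropy means a better fit; the bipartite split should win.
print(sbm_entropy(edges, bipartite) < sbm_entropy(edges, mixed))
```

This brute-force scoring recomputes all block counts for every partition, so it does not reflect the near-linear complexity claimed in the abstract; it only illustrates the kind of quantity an MCMC or agglomerative scheme could compare between candidate partitions.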
- Publication: Physical Review E
- Pub Date: January 2014
- arXiv: arXiv:1310.4378
- Bibcode: 2014PhRvE..89a2804P
- Keywords: 89.75.Hc; 02.50.Tt; 89.70.Cf; Networks and genealogical trees; Inference methods; Entropy and other measures of information; Physics - Data Analysis; Statistics and Probability; Condensed Matter - Statistical Mechanics; Computer Science - Social and Information Networks; Physics - Computational Physics; Statistics - Machine Learning
- E-Print: 9 pages, 9 figures