Phase transitions in semi-supervised clustering of sparse networks
Abstract
Predicting labels of nodes in a network, such as community memberships or demographic variables, is an important problem with applications in social and biological networks. A recently discovered phase transition puts fundamental limits on the accuracy of these predictions if we have access only to the network topology. However, if we know the correct labels of some fraction α of the nodes, we can do better. We study the phase diagram of this semi-supervised learning problem for networks generated by the stochastic block model. We use the cavity method and the associated belief propagation algorithm to study what accuracy can be achieved as a function of α. For k = 2 groups, we find that the detectability transition disappears for any α > 0, in agreement with previous work. For larger k, where a hard but detectable regime exists, we find that the easy/hard transition (the point at which efficient algorithms can do better than chance) becomes a line of transitions where the accuracy jumps discontinuously at a critical value of α. This line ends in a critical point with a second-order transition, beyond which the accuracy is a continuous function of α. We demonstrate qualitatively similar transitions in two real-world networks.
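As an illustration of the setting described above, the following minimal Python sketch samples a sparse network from a stochastic block model and reveals the true labels of a fraction α of its nodes. The function name `sample_sbm`, the parameters `c_in` and `c_out`, and the choice of edge probabilities `c_in / n` and `c_out / n` are assumptions made for this example only; the abstract does not specify a parameterization, and the sketch does not implement the cavity method or belief propagation.

```python
import random


def sample_sbm(n, k, c_in, c_out, alpha, seed=0):
    """Sample a sparse SBM graph and reveal a fraction alpha of the labels.

    Assumed parameterization (not taken from the abstract): two nodes in
    the same group are linked with probability c_in / n, and two nodes in
    different groups with probability c_out / n.
    """
    rng = random.Random(seed)

    # Assign each node to one of k groups uniformly at random.
    labels = [rng.randrange(k) for _ in range(n)]

    # Draw each possible edge independently (O(n^2); fine for small n).
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = (c_in if labels[i] == labels[j] else c_out) / n
            if rng.random() < p:
                edges.append((i, j))

    # Reveal the true label of a uniformly chosen fraction alpha of nodes.
    n_known = round(alpha * n)
    known = rng.sample(range(n), n_known)
    revealed = {i: labels[i] for i in known}

    return labels, edges, revealed


if __name__ == "__main__":
    labels, edges, revealed = sample_sbm(n=200, k=2, c_in=8.0, c_out=2.0, alpha=0.1)
    print(len(edges), "edges;", len(revealed), "revealed labels")
```

A semi-supervised inference method would take `edges` and `revealed` as input and be scored against the hidden entries of `labels`.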
 Publication:

Physical Review E
 Pub Date:
 November 2014
 DOI:
 10.1103/PhysRevE.90.052802
 arXiv:
 arXiv:1404.7789
 Bibcode:
 2014PhRvE..90e2802Z
 Keywords:

 64.60.aq;
 64.60.De;
 89.20.-a;
 02.10.Ox;
 Networks;
 Statistical mechanics of model systems;
 Interdisciplinary applications of physics;
 Combinatorics;
 graph theory;
 Computer Science - Social and Information Networks;
 Condensed Matter - Statistical Mechanics;
 Physics - Physics and Society;
 Statistics - Machine Learning
 EPrint:
 Phys. Rev. E 90, 052802 (2014)