Stability of Maximum likelihood based clustering methods: exploring the backbone of classifications (Who is keeping you in that community?)
Abstract
Components of complex systems are often classified according to the way they interact with each other. In graph theory, such groups are known as clusters or communities. Many different techniques have recently been proposed to detect them, some of which involve inference methods using either Bayesian or Maximum Likelihood approaches. In this article, we study a statistical model designed for detecting clusters based on connection similarity. The basic assumption of the model is that the graph was generated by a certain grouping of the nodes, and an Expectation Maximization algorithm is employed to infer that grouping. We show that the method admits further development to yield a stability analysis of the groupings that quantifies the extent to which each node influences its neighbors' group membership. Our approach naturally allows for the identification of the key elements responsible for the grouping and their resilience to changes in the network. Given the generality of the assumptions underlying the statistical model, such nodes are likely to play special roles in the original system. We illustrate this point by analyzing several empirical networks for which further information about the properties of the nodes is available. The search for and identification of stabilizing nodes thus constitutes a novel technique to characterize the relevance of nodes in complex networks.
 Publication:

arXiv e-prints
 Pub Date:
 September 2008
 arXiv:
 arXiv:0809.1398
 Bibcode:
 2008arXiv0809.1398M
 Keywords:

 Physics - Physics and Society;
 Condensed Matter - Statistical Mechanics;
 Computer Science - Information Theory;
 Physics - Computational Physics;
 Physics - Data Analysis;
 Statistics and Probability
 E-Print:
 19 pages, 9 figures