(Un)detectable Cluster Structure in Sparse Networks
Abstract
Can a cluster structure in a sparse relational data set, i.e., a network, be detected at all by unsupervised clustering techniques? We answer this question by means of statistical mechanics, which makes our analysis independent of any particular clustering algorithm. We find a sharp transition from a phase in which the cluster structure is not detectable at all to a phase in which it can be detected with high accuracy. We analytically calculate the transition point and the shape of the transition, i.e., the theoretically achievable accuracy. This illuminates theoretical limitations of data mining in networks and allows for an understanding and evaluation of the performance of a variety of algorithms.
- Publication:
- Physical Review Letters
- Pub Date:
- August 2008
- DOI:
- arXiv:
- arXiv:0711.1452
- Bibcode:
- 2008PhRvL.101g8701R
- Keywords:
- 89.75.Hc;
- 05.50.+q;
- 89.65.-s;
- Networks and genealogical trees;
- Lattice theory and statistics;
- Social and economic systems;
- Condensed Matter - Disordered Systems and Neural Networks;
- Condensed Matter - Statistical Mechanics
- E-Print:
- 4 Pages, 2 Figures