Mean clustering coefficients: the role of isolated nodes and leafs on clustering measures for small-world networks
Abstract
Many networks exhibit the small-world property of the neighborhood connectivity being higher than in comparable random networks. However, the standard measure of local neighborhood clustering is typically not defined if a node has one or no neighbors. In such cases, local clustering has traditionally been set to zero and this value influenced the global clustering coefficient. Such a procedure leads to underestimation of the neighborhood clustering in sparse networks. We propose to include θ as the proportion of leafs and isolated nodes to estimate the contribution of these cases and provide a formula for estimating a clustering coefficient excluding these cases from the Watts and Strogatz (1998 Nature 393 440-2) definition of the clustering coefficient. Excluding leafs and isolated nodes leads to values which are up to 140% higher than the traditional values for the observed networks indicating that neighborhood connectivity is normally underestimated. We find that the definition of the clustering coefficient has a major effect when comparing different networks. For metabolic networks of 43 organisms, relations changed for 58% of the comparisons when a different definition was applied. We also show that the definition influences small-world features and that the classification can change from non-small-world to small-world network. We discuss the use of an alternative measure, disconnectedness D, which is less influenced by leafs and isolated nodes.
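The effect described in the abstract can be illustrated with a small sketch. The abstract does not spell out the formula, so the relation checked here, C_excluding = C_traditional / (1 - θ), is an assumption derived only from the stated definitions: if nodes with fewer than two neighbors contribute zero to the traditional mean, dropping them from the average rescales it by 1 / (1 - θ). The function names and the toy graph are hypothetical, the graph is assumed undirected and stored as a dictionary of neighbor sets, and the disconnectedness measure D is not covered.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are linked to each other.

    Returns None when the node has fewer than two neighbors, because the
    local measure is not defined in that case.
    """
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return None
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


def clustering_summary(adj):
    """Return (theta, traditional mean, mean excluding leafs/isolated nodes)."""
    values = [local_clustering(adj, n) for n in adj]
    defined = [c for c in values if c is not None]
    n_total = len(values)
    # theta: proportion of nodes with degree 0 or 1
    theta = (n_total - len(defined)) / n_total
    # Traditional variant: undefined cases count as zero in the average
    c_traditional = sum(defined) / n_total
    # Alternative variant: average only over nodes with degree >= 2
    c_excluding = sum(defined) / len(defined) if defined else 0.0
    return theta, c_traditional, c_excluding


# Hypothetical toy graph: triangle a-b-c, leaf d attached to a, isolated node e
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": set(),
}

theta, c_trad, c_excl = clustering_summary(example)
print(theta, c_trad, c_excl)
```

In this toy graph two of the five nodes are a leaf or isolated, so θ = 0.4 and the mean over the remaining nodes is larger than the traditional mean by the factor 1 / (1 - θ).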
 Publication:

New Journal of Physics
 Pub Date:
 August 2008
 DOI:
 10.1088/1367-2630/10/8/083042
 arXiv:
 arXiv:0802.2512
 Bibcode:
 2008NJPh...10h3042K
 Keywords:

 Physics - Physics and Society;
 Quantitative Biology - Molecular Networks
 E-Print:
 final version of the manuscript