Mean clustering coefficients: the role of isolated nodes and leafs on clustering measures for small-world networks
Many networks exhibit the small-world property of the neighborhood connectivity being higher than in comparable random networks. However, the standard measure of local neighborhood clustering is typically not defined if a node has one or no neighbors. In such cases, local clustering has traditionally been set to zero and this value influenced the global clustering coefficient. Such a procedure leads to underestimation of the neighborhood clustering in sparse networks. We propose to include θ as the proportion of leafs and isolated nodes to estimate the contribution of these cases and provide a formula for estimating a clustering coefficient excluding these cases from the Watts and Strogatz (1998 Nature 393 440-2) definition of the clustering coefficient. Excluding leafs and isolated nodes leads to values which are up to 140% higher than the traditional values for the observed networks indicating that neighborhood connectivity is normally underestimated. We find that the definition of the clustering coefficient has a major effect when comparing different networks. For metabolic networks of 43 organisms, relations changed for 58% of the comparisons when a different definition was applied. We also show that the definition influences small-world features and that the classification can change from non-small-world to small-world network. We discuss the use of an alternative measure, disconnectedness D, which is less influenced by leafs and isolated nodes.