From random walks to distances on unweighted graphs
Abstract
Large unweighted directed graphs are commonly used to capture relations between entities. A fundamental problem in the analysis of such networks is to properly define the similarity or dissimilarity between any two vertices. Despite the significance of this problem, statistical characterization of the proposed metrics has been limited. We introduce and develop a class of techniques for analyzing random walks on graphs using stochastic calculus. Using these techniques, we generalize results on the degeneracy of hitting times and analyze a metric based on the Laplace-transformed hitting time (LTHT). The metric serves as a natural, provably well-behaved alternative to the expected hitting time. We establish a general correspondence between hitting times of Brownian motion and analogous hitting times on the graph. We show that the LTHT is consistent with respect to the underlying metric of a geometric graph, preserves clustering tendency, and remains robust against random addition of non-geometric edges. Tests on simulated and real-world data show that the LTHT matches theoretical predictions and outperforms alternatives.
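As a rough illustration of the quantity the abstract refers to, the sketch below computes a Laplace-transformed hitting time on a small directed graph. The abstract does not give the formula, so the definition used here, -log E[exp(-beta * T)] with T the first hitting time of a target vertex by a simple random walk, is an assumption, as are the function name `ltht_to_target`, the adjacency-list input, and the fixed-point solver. It relies on the one-step identity u(v) = exp(-beta) * mean of u over the out-neighbors of v, with u(target) = 1.

```python
import math

def ltht_to_target(out_neighbors, target, beta, tol=1e-12, max_iter=10_000):
    """Return {v: -log E_v[exp(-beta * T)]}, where T is the first time a simple
    random walk started at v reaches `target`.

    Assumed setup (not taken from the abstract): `out_neighbors` maps every
    vertex to a list of its out-neighbors, and every neighbor is itself a key.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    discount = math.exp(-beta)

    # u[v] approximates E_v[exp(-beta * T)]; the target has T = 0, so u = 1.
    u = {v: 0.0 for v in out_neighbors}
    u[target] = 1.0

    # Fixed-point iteration; it contracts by a factor exp(-beta) per sweep.
    for _ in range(max_iter):
        new_u = {}
        for v, nbrs in out_neighbors.items():
            if v == target:
                new_u[v] = 1.0
            elif nbrs:
                new_u[v] = discount * sum(u[w] for w in nbrs) / len(nbrs)
            else:
                new_u[v] = 0.0  # dead end: the target is never reached
        change = max(abs(new_u[v] - u[v]) for v in u)
        u = new_u
        if change < tol:
            break

    # Vertices that cannot reach the target get an infinite distance.
    return {v: (-math.log(x) if x > 0 else math.inf) for v, x in u.items()}


# Directed 3-cycle 0 -> 1 -> 2 -> 0: hitting times of vertex 2 are deterministic.
cycle = {0: [1], 1: [2], 2: [0]}
distances = ltht_to_target(cycle, target=2, beta=0.5)
```

On the directed cycle the walk from vertex 0 reaches vertex 2 in exactly two steps, so under the assumed definition the value is 2 * beta. Larger beta weights short paths more heavily; the iteration is chosen for brevity, and a direct linear solve would work equally well on small graphs.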
 Publication:

arXiv e-prints
 Pub Date:
 November 2015
 arXiv:
 arXiv:1511.00573
 Bibcode:
 2015arXiv151100573H
 Keywords:

 Statistics - Machine Learning;
 Computer Science - Artificial Intelligence;
 Computer Science - Social and Information Networks
 E-Print:
 To appear in NIPS 2015