Metric recovery from directed unweighted graphs
Abstract
We analyze directed, unweighted graphs obtained from $x_i\in \mathbb{R}^d$ by connecting vertex $i$ to $j$ iff $x_i  x_j < \epsilon(x_i)$. Examples of such graphs include $k$nearest neighbor graphs, where $\epsilon(x_i)$ varies from point to point, and, arguably, many real world graphs such as copurchasing graphs. We ask whether we can recover the underlying Euclidean metric $\epsilon(x_i)$ and the associated density $p(x_i)$ given only the directed graph and $d$. We show that consistent recovery is possible up to isometric scaling when the vertex degree is at least $\omega(n^{2/(2+d)}\log(n)^{d/(d+2)})$. Our estimator is based on a careful characterization of a random walk over the directed graph and the associated continuum limit. As an algorithm, it resembles the PageRank centrality metric. We demonstrate empirically that the estimator performs well on simulated examples as well as on realworld copurchasing graphs even with a small number of points and degree scaling as low as $\log(n)$.
 Publication:

arXiv eprints
 Pub Date:
 November 2014
 arXiv:
 arXiv:1411.5720
 Bibcode:
 2014arXiv1411.5720H
 Keywords:

 Statistics  Machine Learning;
 Computer Science  Social and Information Networks;
 Mathematics  Statistics Theory;
 Statistics  Methodology
 EPrint:
 Poster at NIPS workshop on networks. Submitted to AISTATS 2015