Absorbing random-walk centrality: Theory and algorithms
Abstract
We study a new notion of graph centrality based on absorbing random walks. Given a graph $G=(V,E)$ and a set of query nodes $Q\subseteq V$, we aim to identify the $k$ most central nodes in $G$ with respect to $Q$. Specifically, we consider central nodes to be absorbing for random walks that start at the query nodes $Q$. The goal is to find the set of $k$ central nodes that minimizes the expected length of a random walk until absorption. The proposed measure, which we call $k$-absorbing random-walk centrality, favors diverse sets, as it is beneficial to place the $k$ absorbing nodes in different parts of the graph so as to "intercept" random walks that start from different query nodes. Although similar problem definitions have been considered in the literature, e.g., in information-retrieval settings where the goal is to diversify web-search results, in this paper we study the problem formally and prove some of its properties. We show that the problem is NP-hard, while the objective function is monotone and supermodular, implying that a greedy algorithm provides solutions with an approximation guarantee. On the other hand, the greedy algorithm involves expensive matrix operations that make it prohibitive to employ on large datasets. To confront this challenge, we develop more efficient algorithms based on spectral clustering and on personalized PageRank.
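As a rough illustration of the objective and the greedy strategy described above, the following Python sketch computes the expected number of steps until absorption and picks absorbing nodes one at a time. It rests on several assumptions that are not taken from the paper: a plain (non-restarting) random walk on a connected, undirected graph given as a dense adjacency matrix, walks started uniformly at random from the query nodes, and the hypothetical function names `expected_absorption_time` and `greedy_absorbing_set`. It is a naive sketch of the idea, not the authors' implementation.

```python
import numpy as np


def expected_absorption_time(adj, query, absorbing):
    """Mean number of steps until a walk started uniformly from `query`
    first reaches a node in `absorbing` (assumed non-empty, graph connected)."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Row-normalize the adjacency matrix into a transition matrix.
    P = adj / adj.sum(axis=1, keepdims=True)
    absorbing = set(absorbing)
    transient = [v for v in range(n) if v not in absorbing]
    steps = np.zeros(n)  # absorbing nodes need zero steps
    if transient:
        P_tt = P[np.ix_(transient, transient)]
        # Expected steps t from transient nodes satisfy (I - P_TT) t = 1.
        identity = np.eye(len(transient))
        steps[transient] = np.linalg.solve(identity - P_tt, np.ones(len(transient)))
    return float(np.mean([steps[q] for q in query]))


def greedy_absorbing_set(adj, query, k):
    """Add, k times, the node that most reduces the expected absorption time."""
    n = len(adj)
    chosen = []
    for _ in range(k):
        candidates = [v for v in range(n) if v not in chosen]
        best = min(
            candidates,
            key=lambda v: expected_absorption_time(adj, query, chosen + [v]),
        )
        chosen.append(best)
    return chosen
```

Each greedy step in this sketch solves one linear system per candidate node, which hints at why the abstract calls the greedy algorithm's matrix operations prohibitive on large datasets.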
 Publication:
 arXiv e-prints
 Pub Date:
 September 2015
 arXiv:
 arXiv:1509.02533
 Bibcode:
 2015arXiv150902533M
 Keywords:
 Computer Science - Social and Information Networks;
 Computer Science - Data Structures and Algorithms
 E-Print:
 11 pages, 11 figures, short paper to appear at ICDM 2015