NED: An Inter-Graph Node Metric Based On Edit Distance
Abstract
Node similarity is a fundamental problem in graph analytics. However, node similarity between nodes in different graphs (inter-graph nodes) has not received a lot of attention yet. The inter-graph node similarity is important in learning a new graph based on the knowledge of an existing graph (transfer learning on graphs) and has applications in biological, communication, and social networks. In this paper, we propose a novel distance function for measuring inter-graph node similarity with edit distance, called NED. In NED, two nodes are compared according to their local neighborhood structures which are represented as unordered k-adjacent trees, without relying on labels or other assumptions. Since the computation problem of tree edit distance on unordered trees is NP-Complete, we propose a modified tree edit distance, called TED*, for comparing neighborhood trees. TED* is a metric distance, as the original tree edit distance, but more importantly, TED* is polynomially computable. As a metric distance, NED admits efficient indexing, provides interpretable results, and shows to perform better than existing approaches on a number of data analysis tasks, including graph de-anonymization. Finally, the efficiency and effectiveness of NED are empirically demonstrated using real-world graphs.
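To make the idea of comparing nodes through unlabeled, unordered neighborhood trees concrete, the following is a minimal Python sketch. It is not the paper's TED* algorithm, which the abstract does not specify. It assumes that a k-adjacent tree is the breadth-first-search tree of a node truncated at depth k (the depth convention and the first-visit tie-breaking for parents are assumptions), and it only tests whether two such trees are identical as unordered trees by comparing a canonical string. The names `k_adjacent_tree` and `canonical_form` are hypothetical.

```python
from collections import deque


def k_adjacent_tree(adj, root, k):
    """Build the BFS tree of `root`, keeping nodes at depth <= k.

    adj: dict mapping each node to an iterable of its neighbors.
    Returns a dict mapping each tree node to the list of its children.
    Assumption: a node reachable from several parents is attached to
    the first parent that visits it.
    """
    depth = {root: 0}
    children = {root: []}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if depth[u] == k:  # do not expand below depth k
            continue
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                children[v] = []
                children[u].append(v)
                queue.append(v)
    return children


def canonical_form(children, node):
    """Label-free canonical string of an unordered rooted tree.

    Sorting the encodings of the subtrees makes the result independent
    of child order, so two trees get the same string exactly when they
    are isomorphic as unordered rooted trees.
    """
    parts = sorted(canonical_form(children, c) for c in children[node])
    return "(" + "".join(parts) + ")"


# Two small path graphs with unrelated node identifiers.
g1 = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
g2 = {1: [2], 2: [1, 3], 3: [2]}

same = (canonical_form(k_adjacent_tree(g1, "b", 1), "b")
        == canonical_form(k_adjacent_tree(g2, 2, 1), 2))
```

Here `same` is true because the middle nodes of both paths have structurally identical 1-hop neighborhoods, even though the graphs share no labels. A distance such as TED* would go further and quantify how different two non-identical trees are.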
 Publication:

arXiv e-prints
 Pub Date:
 February 2016
 arXiv:
 arXiv:1602.02358
 Bibcode:
 2016arXiv160202358Z
 Keywords:

 Computer Science - Databases;
 Computer Science - Machine Learning;
 Computer Science - Social and Information Networks