Planted Hitting Set Recovery in Hypergraphs
Abstract
In various application areas, networked data is collected by measuring interactions involving some specific set of core nodes. This results in a network dataset containing the core nodes along with a potentially much larger set of fringe nodes that all have at least one interaction with a core node. In many settings, this type of data arises for structures that are richer than graphs, because they involve the interactions of larger sets; for example, the core nodes might be a set of individuals under surveillance, where we observe the attendees of meetings involving at least one of the core individuals. We model such scenarios using hypergraphs, and we study the problem of core recovery: if we observe the hypergraph but not the labels of core and fringe nodes, can we recover the "planted" set of core nodes in the hypergraph? We provide a theoretical framework for analyzing the recovery of such a set of core nodes and use our theory to develop a practical and scalable algorithm for core recovery. The crux of our analysis and algorithm is that the core nodes are a hitting set of the hypergraph, meaning that every hyperedge has at least one node in the set of core nodes. We demonstrate the efficacy of our algorithm on a number of real-world datasets, outperforming competitive baselines derived from network centrality and core-periphery measures.
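To make the hitting-set property concrete, here is a minimal Python sketch. It is an illustration only and not the algorithm developed in the paper: the function names (is_hitting_set, greedy_hitting_set), the toy meeting data, and the choice of a generic greedy heuristic are all assumptions introduced for this example.

```python
from typing import Hashable, Iterable, List, Set


def is_hitting_set(hyperedges: Iterable[Iterable[Hashable]],
                   candidate: Iterable[Hashable]) -> bool:
    """Return True if every hyperedge contains at least one candidate node."""
    candidate_set = set(candidate)
    return all(candidate_set & set(edge) for edge in hyperedges)


def greedy_hitting_set(hyperedges: Iterable[Iterable[str]]) -> Set[str]:
    """Generic greedy heuristic (an assumption, not the paper's method):
    repeatedly pick the node that appears in the most not-yet-hit hyperedges."""
    remaining: List[Set[str]] = [set(e) for e in hyperedges if e]
    chosen: Set[str] = set()
    while remaining:
        counts = {}
        for edge in remaining:
            for node in edge:
                counts[node] = counts.get(node, 0) + 1
        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(counts), key=counts.get)
        chosen.add(best)
        remaining = [e for e in remaining if best not in e]
    return chosen


# Hypothetical toy data: each hyperedge is the attendee set of one meeting,
# and every meeting includes at least one planted core individual.
meetings = [
    {"alice", "bob"},
    {"alice", "carol", "dave"},
    {"erin", "frank", "bob"},
    {"erin", "grace"},
]
planted_core = {"alice", "erin"}

print(is_hitting_set(meetings, planted_core))  # the planted core hits every meeting
print(is_hitting_set(meetings, {"bob"}))       # bob misses two of the meetings
print(greedy_hitting_set(meetings))
```

A hitting set is generally not unique, so finding some hitting set does not by itself recover the planted core; the sketch only shows the structural constraint that the recovery problem builds on.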
Publication: arXiv e-prints
Pub Date: May 2019
arXiv: arXiv:1905.05839
Bibcode: 2019arXiv190505839A
Keywords: Computer Science - Social and Information Networks; Computer Science - Machine Learning; Statistics - Machine Learning
E-Print: 10 pages