A Statistical Model for Motifs Detection

doi:10.48550/arXiv.1511.05254

We consider a statistical model for the problem of finding subgraphs with a specified topology in an otherwise random graph. This task plays an important role in the analysis of social and biological networks, where small subgraphs with a specific structure have important functional roles and are referred to as 'motifs'. Within this model, one or multiple copies of a subgraph are added ('planted') in an Erdős-Rényi random graph with $n$ vertices and edge probability $q_0$. We ask whether the resulting graph can be reliably distinguished from a pure Erdős-Rényi random graph, and we present two types of results. First, we investigate the question from a purely statistical perspective and ask whether there is any test that can distinguish between the two graph models. We provide necessary and sufficient conditions that are essentially tight for small enough subgraphs. Next, we study two polynomial-time algorithms for solving the same problem: a spectral algorithm and a semidefinite programming (SDP) relaxation. For the spectral algorithm, we establish sufficient conditions under which it distinguishes the two graph models with high probability. Under the same conditions, the spectral algorithm also identifies the hidden subgraph. The spectral algorithm is substantially suboptimal compared with the optimal test. We show that a similar gap is present for the more sophisticated SDP approach.
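The planted model described in the abstract can be illustrated with a minimal sketch, given here only as an aid to intuition. It samples an Erdős-Rényi graph, adds the edges of a motif on a random vertex subset, and compares the largest eigenvalue of the centered adjacency matrix under both models. Everything in it is an assumption rather than the paper's construction: the function names, the choice of a clique as the motif, the centering by $q_0$, and the use of the top eigenvalue as a statistic are all hypothetical choices for illustration.

```python
import numpy as np

def erdos_renyi(n, q0, rng):
    """Symmetric 0/1 adjacency matrix of G(n, q0) without self-loops."""
    upper = np.triu(rng.random((n, n)) < q0, k=1)
    return (upper | upper.T).astype(float)

def plant_motif(adj, motif, rng):
    """Add the motif's edges on a uniformly random vertex subset (assumed planting rule)."""
    k = motif.shape[0]
    idx = rng.choice(adj.shape[0], size=k, replace=False)
    planted = adj.copy()
    block = np.ix_(idx, idx)
    # Edges are only added, never removed, matching "added ('planted')".
    planted[block] = np.maximum(planted[block], motif)
    return planted, idx

def top_eigenvalue(adj, q0):
    """Largest eigenvalue of the adjacency matrix minus its mean under the null model."""
    n = adj.shape[0]
    centered = adj - q0 * (np.ones((n, n)) - np.eye(n))
    return np.linalg.eigvalsh(centered)[-1]  # eigvalsh returns ascending order

rng = np.random.default_rng(0)
n, q0, k = 200, 0.1, 40                     # illustrative sizes, not from the paper
clique = np.ones((k, k)) - np.eye(k)        # hypothetical motif: a k-clique

null_graph = erdos_renyi(n, q0, rng)
planted_graph, hidden = plant_motif(erdos_renyi(n, q0, rng), clique, rng)

print("null statistic:   ", top_eigenvalue(null_graph, q0))
print("planted statistic:", top_eigenvalue(planted_graph, q0))
```

With a motif this large relative to $n$, the planted statistic should clearly exceed the null one; the abstract's point is that such a spectral approach needs a much stronger signal than the optimal test does.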

Publication: arXiv e-prints

Pub Date: November 2015

DOI: 10.48550/arXiv.1511.05254

arXiv: arXiv:1511.05254

Bibcode: 2015arXiv151105254J

Keywords: Mathematics - Statistics Theory; Computer Science - Discrete Mathematics; Computer Science - Information Theory

E-Print: 40 pages, 1 pdf figure

NASA/ADS
