A Novel Approach to Finding Near-Cliques: The Triangle-Densest Subgraph Problem
Abstract
Many graph mining applications rely on detecting subgraphs that are near-cliques. Existing work on this problem exhibits a dichotomy: on the one hand, the densest subgraph problem (DSP), which maximizes the average degree over all subgraphs, is solvable in polynomial time but for many networks fails to find near-cliques. On the other hand, formulations geared towards finding near-cliques are NP-hard and frequently inapproximable due to connections with the Maximum Clique problem. In this work, we propose a formulation that combines the best of both worlds: it is solvable in polynomial time and finds near-cliques when the DSP fails. Surprisingly, our formulation is a simple variation of the DSP. Specifically, we define the triangle densest subgraph problem (TDSP): given $G(V,E)$, find a subset of vertices $S^*$ such that $\tau(S^*)=\max_{S \subseteq V} \frac{t(S)}{|S|}$, where $t(S)$ is the number of triangles induced by the set $S$. We provide various exact and approximation algorithms that solve the TDSP efficiently. Furthermore, we show how our algorithms adapt to the more general problem of maximizing the $k$-clique average density. Finally, we provide empirical evidence that the TDSP should be used whenever the DSP fails to output a near-clique.
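To make the objective $\tau(S) = t(S)/|S|$ concrete, the following minimal Python sketch evaluates it on a toy graph and finds the maximizer by exhaustive search. It is only an illustration of the definition, not one of the paper's exact or approximation algorithms: the search is exponential in $|V|$ and suitable only for very small graphs. The function names (`count_triangles`, `triangle_density`, `brute_force_tdsp`) and the example graph are hypothetical, chosen for this illustration.

```python
from itertools import combinations


def count_triangles(edges, S):
    """Return t(S): the number of triangles in the subgraph induced by S."""
    S = set(S)
    adj = {v: set() for v in S}
    for u, v in edges:
        # Keep only edges with both endpoints in S (the induced subgraph).
        if u in S and v in S and u != v:
            adj[u].add(v)
            adj[v].add(u)
    return sum(
        1
        for a, b, c in combinations(sorted(S), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def triangle_density(edges, S):
    """Return tau(S) = t(S) / |S|, with tau of the empty set taken as 0."""
    S = set(S)
    return count_triangles(edges, S) / len(S) if S else 0.0


def brute_force_tdsp(vertices, edges):
    """Exhaustively search all non-empty subsets for the maximum tau(S).

    Illustration only: runs in time exponential in the number of vertices.
    """
    best_set, best_val = set(), 0.0
    vertices = sorted(vertices)
    for k in range(1, len(vertices) + 1):
        for S in combinations(vertices, k):
            val = triangle_density(edges, S)
            if val > best_val:
                best_set, best_val = set(S), val
    return best_set, best_val


# Toy graph (an assumption for illustration): a 4-clique on {0, 1, 2, 3}
# with a short path 3-4-5 attached.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
best_set, best_val = brute_force_tdsp(range(6), toy_edges)
print(best_set, best_val)
```

On this toy graph the 4-clique contains 4 triangles on 4 vertices, so its triangle density is 1.0, while adding either path vertex only increases the denominator.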
- Publication: arXiv e-prints
- Pub Date: May 2014
- DOI: 10.48550/arXiv.1405.1477
- arXiv: arXiv:1405.1477
- Bibcode: 2014arXiv1405.1477T
- Keywords: Computer Science - Data Structures and Algorithms; Computer Science - Discrete Mathematics; Computer Science - Social and Information Networks
- E-Print: 42 pages