On 2-Clubs in Graph-Based Data Clustering: Theory and Algorithm Engineering
Abstract
Editing a graph into a disjoint union of clusters is a standard optimization task in graph-based data clustering. Here, complementing classic work where the clusters shall be cliques, we focus on clusters that shall be 2-clubs, that is, subgraphs of diameter two. This naturally leads to the two NP-hard problems 2-Club Cluster Editing (the allowed editing operations are edge insertion and edge deletion) and 2-Club Cluster Vertex Deletion (the allowed editing operations are vertex deletions). Answering an open question from the literature, we show that 2-Club Cluster Editing is W[2]-hard with respect to the number of edge modifications, thus contrasting the fixed-parameter tractability result for the classic Cluster Editing problem (considering cliques instead of 2-clubs). Then focusing on 2-Club Cluster Vertex Deletion, which is easily seen to be fixed-parameter tractable, we show that under standard complexity-theoretic assumptions it does not have a polynomial-size problem kernel when parameterized by the number of vertex deletions. Nevertheless, we develop several effective data reduction and pruning rules, resulting in a competitive solver, clearly outperforming a standard CPLEX solver in most instances of an established biological test data set.
- Publication:
-
arXiv e-prints
- Pub Date:
- June 2020
- DOI:
- arXiv:
- arXiv:2006.14972
- Bibcode:
- 2020arXiv200614972F
- Keywords:
-
- Computer Science - Data Structures and Algorithms