Correlation Clustering with Overlap: a Heuristic Graph Editing Approach
Abstract
Correlation clustering seeks a partition of the vertex set of a given graph/network into groups of closely related, or just close enough, vertices so that elements of different groups are not close to each other. The problem has previously been modeled and studied as a graph editing problem, namely Cluster Editing, which assumes that closely related data elements must be adjacent. As such, the main objective of Cluster Editing is to turn clusters into cliques as a way to identify them, using two edge editing operations: additions and deletions. There are two problems with the Cluster Editing model that we seek to address in this paper. First, "closely" related does not necessarily mean "directly" related, so closeness should be measured by relatively short distance. As such, we seek to turn clusters into (sub)graphs of small diameter. Second, in real applications, a data element can belong to, or have roles in, multiple groups. In some cases, restricting each data element to a single cluster makes it hard to achieve any clustering via classical partition-based methods. We address this latter problem by allowing vertex cloning, also known as vertex splitting. Heuristic methods for the introduced problem are presented along with experimental results showing the effectiveness of the proposed model and algorithmic approach.
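To make the two modeling changes concrete, the following is a minimal Python sketch, not the paper's heuristic. It checks whether every connected component of a graph has diameter at most a given bound (a bound of 1 recovers the clique condition of Cluster Editing) and shows how a single vertex split can separate two overlapping groups. The function names, the adjacency-dictionary representation, the example graph, and the particular form of the split (the clone takes over a chosen subset of the original vertex's edges) are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import deque


def components(adj):
    """Return the connected components of an undirected graph as a list of sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        comps.append(comp)
    return comps


def eccentricity(adj, source):
    """Largest breadth-first-search distance from source within its component."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def is_valid_clustering(adj, max_diameter):
    """True if every component has diameter <= max_diameter (1 means cliques)."""
    return all(
        eccentricity(adj, v) <= max_diameter
        for comp in components(adj)
        for v in comp
    )


def split_vertex(adj, v, clone, clone_neighbors):
    """Return a copy of the graph in which `clone` takes over the edges
    between v and each vertex in clone_neighbors (an assumed form of splitting)."""
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    new_adj[clone] = set()
    for u in clone_neighbors:
        new_adj[v].discard(u)
        new_adj[u].discard(v)
        new_adj[clone].add(u)
        new_adj[u].add(clone)
    return new_adj


# Hypothetical example: two triangles {a, b, c} and {c, d, e} sharing vertex c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}
split_graph = split_vertex(graph, "c", "c2", {"d", "e"})
```

In this example the original graph is a single component of diameter 2, so it is not a disjoint union of cliques but does satisfy a diameter bound of 2. After splitting c, the graph consists of two triangles, {a, b, c} and {c2, d, e}, so c effectively belongs to both clusters without any edge being added or deleted.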
- Publication: arXiv e-prints
- Pub Date: November 2024
- arXiv: arXiv:2412.02704
- Bibcode: 2024arXiv241202704A
- Keywords: Computer Science - Social and Information Networks