To cluster, or not to cluster: An analysis of clusterability methods
Abstract
Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. For most applications, applying clustering is only appropriate when cluster structure is present. As such, the study of clusterability, which evaluates whether data possesses such structure, is an integral part of cluster analysis. However, methods for evaluating clusterability vary radically, making it challenging to select a suitable measure. In this paper, we perform an extensive comparison of measures of clusterability and provide guidelines that clustering users can reference to select suitable measures for their applications.
- Publication: Pattern Recognition
- Pub Date: April 2019
- DOI: 10.1016/j.patcog.2018.10.026
- arXiv: arXiv:1808.08317
- Bibcode: 2019PatRe..88...13A
- Keywords: Clusterability; Cluster structure; Cluster tendency; Dimension reduction; Multimodality tests; Statistics - Machine Learning; Computer Science - Machine Learning
- E-Print: 30 pages, 3 figures, 10 tables