Spatially-Aware Comparison and Consensus for Clusterings
Abstract
This paper proposes a new distance metric between clusterings that incorporates information about the spatial distribution of points and clusters. Our approach builds on the idea of a Hilbert space-based representation of clusters as a combination of the representations of their constituent points. We use this representation and the underlying metric to design a spatially-aware consensus clustering procedure. This consensus procedure is implemented via a novel reduction to Euclidean clustering, and is both simple and efficient. All of our results apply to both soft and hard clusterings. We accompany these algorithms with a detailed experimental evaluation that demonstrates the efficiency and quality of our techniques.
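As a rough illustration of the representation described above, the sketch below treats a cluster as a weighted combination of the feature-space images of its points and measures the distance between two clusters with the kernel trick. It is a minimal sketch under stated assumptions, not the paper's implementation: the Gaussian kernel, the bandwidth `sigma`, and the names `gaussian_kernel` and `cluster_distance` are all hypothetical choices made for this example. Uniform weights model a hard cluster, and passing membership weights models a soft one.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise Gaussian kernel values between rows of X and rows of Y.
    # The Gaussian kernel is an assumption made for this sketch.
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def cluster_distance(A, B, wa=None, wb=None, sigma=1.0):
    # Each cluster is the weighted sum of its points' feature-space images.
    # Uniform weights give a hard cluster; membership weights give a soft one.
    wa = np.full(len(A), 1.0 / len(A)) if wa is None else np.asarray(wa, dtype=float)
    wb = np.full(len(B), 1.0 / len(B)) if wb is None else np.asarray(wb, dtype=float)
    # Squared Hilbert-space norm of the difference, expanded into inner products.
    sq = (wa @ gaussian_kernel(A, A, sigma) @ wa
          + wb @ gaussian_kernel(B, B, sigma) @ wb
          - 2.0 * wa @ gaussian_kernel(A, B, sigma) @ wb)
    return float(np.sqrt(max(sq, 0.0)))

# Three small clusters: B is a slight shift of A, C is a large shift of A.
A = np.array([[0.0, 0.0], [0.0, 1.0]])
B = A + 0.5
C = A + 5.0
d_near = cluster_distance(A, B)
d_far = cluster_distance(A, C)
```

Because the distance depends on where the points lie and not only on which points share a label, a nearby cluster (`B`) comes out closer to `A` than a distant one (`C`), which is the spatial awareness the abstract refers to.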
- Publication: arXiv e-prints
- Pub Date: January 2011
- DOI: 10.48550/arXiv.1102.0026
- arXiv: arXiv:1102.0026
- Bibcode: 2011arXiv1102.0026R
- Keywords: Computer Science - Machine Learning; Computer Science - Computational Geometry; Computer Science - Databases
- E-Print: 12 pages, 9 figures, Proceedings of the 2011 SIAM International Conference on Data Mining