Clustering with shallow trees
Abstract
We propose a new method for hierarchical clustering based on the optimization of a cost function over trees of limited depth, and we derive a message-passing method that makes it efficient to apply. The method and the associated algorithm can be interpreted as a natural interpolation between two well-known approaches: single linkage and the recently presented affinity propagation. Using this general scheme, we analyse three biological/medical structured data sets (human population based on genetic information, proteins based on sequences, and verbal autopsies) and show that the interpolation technique provides new insight.
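For orientation, single linkage, one of the two endpoints the abstract says the method interpolates between, can be sketched in a few lines. This is the textbook procedure, not the paper's message-passing algorithm. The function name `single_linkage`, the dense distance-matrix input, and the stopping rule of a fixed cluster count `k` are assumptions made for illustration only.

```python
from typing import List, Sequence


def single_linkage(dist: Sequence[Sequence[float]], k: int) -> List[int]:
    """Merge the two closest groups repeatedly until k groups remain.

    Hypothetical helper for illustration; `dist` is a symmetric n x n matrix.
    Returns one cluster label per point, numbered in order of first appearance.
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i: int) -> int:
        # Follow parent pointers to the cluster representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Visiting pairs by increasing distance is Kruskal's minimum spanning
    # tree construction; stopping early leaves k connected components.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    clusters = n
    for _, i, j in edges:
        if clusters <= k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            clusters -= 1

    relabel: dict = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


# Five points on a line: three near the origin, two near 10.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]
print(single_linkage(dist, 2))  # → [0, 0, 0, 1, 1]
```

Because single linkage only ever looks at the shortest link between two groups, it can chain distant points together through intermediates. Affinity propagation instead assigns every point directly to an exemplar.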
- Publication: Journal of Statistical Mechanics: Theory and Experiment
- Pub Date: December 2009
- DOI: 10.1088/1742-5468/2009/12/P12010
- arXiv: arXiv:0910.0767
- Bibcode: 2009JSMTE..12..010B
- Keywords: Condensed Matter - Disordered Systems and Neural Networks; Computer Science - Data Structures and Algorithms; Quantitative Biology - Quantitative Methods
- E-Print: 11 pages, 7 figures