An Asymptotic Equivalence between the Mean-Shift Algorithm and the Cluster Tree
Abstract
Two important nonparametric approaches to clustering emerged in the 1970's: clustering by level sets or cluster tree as proposed by Hartigan, and clustering by gradient lines or gradient flow as proposed by Fukunaga and Hosteler. In a recent paper, we argue the thesis that these two approaches are fundamentally the same by showing that the gradient flow provides a way to move along the cluster tree. In making a stronger case, we are confronted with the fact the cluster tree does not define a partition of the entire support of the underlying density, while the gradient flow does. In the present paper, we resolve this conundrum by proposing two ways of obtaining a partition from the cluster tree -- each one of them very natural in its own right -- and showing that both of them reduce to the partition given by the gradient flow under standard assumptions on the sampling density.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2021
- DOI:
- 10.48550/arXiv.2111.10298
- arXiv:
- arXiv:2111.10298
- Bibcode:
- 2021arXiv211110298A
- Keywords:
-
- Mathematics - Statistics Theory;
- Computer Science - Machine Learning