Clustering by Deep Nearest Neighbor Descent (D-NND): A Density-based Parameter-Insensitive Clustering Method
Abstract
Most density-based clustering methods largely rely on how well the underlying density is estimated. However, density estimation itself is also a challenging problem, especially the determination of the kernel bandwidth. A large bandwidth could lead to the over-smoothed density estimation in which the number of density peaks could be less than the true clusters, while a small bandwidth could lead to the under-smoothed density estimation in which spurious density peaks, or called the "ripple noise", would be generated in the estimated density. In this paper, we propose a density-based hierarchical clustering method, called the Deep Nearest Neighbor Descent (D-NND), which could learn the underlying density structure layer by layer and capture the cluster structure at the same time. The over-smoothed density estimation could be largely avoided and the negative effect of the under-estimated cases could be also largely reduced. Overall, D-NND presents not only the strong capability of discovering the underlying cluster structure but also the remarkable reliability due to its insensitivity to parameters.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2015
- DOI:
- 10.48550/arXiv.1512.02097
- arXiv:
- arXiv:1512.02097
- Bibcode:
- 2015arXiv151202097Q
- Keywords:
-
- Statistics - Machine Learning;
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Machine Learning;
- Statistics - Computation;
- Statistics - Methodology
- E-Print:
- 28 pages, 14 figures