Variable Selection Methods for Model-based Clustering
Abstract
Model-based clustering is a popular approach for clustering multivariate data, with applications in numerous fields. High-dimensional data are increasingly common, and the model-based clustering approach has adapted to deal with the increasing dimensionality. In particular, the development of variable selection techniques has received considerable attention and research effort in recent years. Even for small problems, variable selection has been advocated as a way to facilitate interpretation of the clustering results. This review summarizes the methods developed for variable selection in model-based clustering. Existing R packages implementing the different methods are indicated and illustrated through two data analysis examples.
- Publication: arXiv e-prints
- Pub Date: July 2017
- DOI: 10.48550/arXiv.1707.00306
- arXiv: arXiv:1707.00306
- Bibcode: 2017arXiv170700306F
- Keywords: Statistics - Methodology; Statistics - Applications; Statistics - Machine Learning
- E-Print: Statistics Surveys, 12 (2018) 1-48