Logistic Normal Multinomial Factor Analyzers for Clustering Microbiome Data
Abstract
The human microbiome plays an important role in human health and disease status. Next-generation sequencing technologies allow for quantifying the composition of the human microbiome. Clustering these microbiome data can provide valuable information by identifying underlying patterns across samples. Recently, Fang and Subedi (2020) proposed a logistic normal multinomial mixture model (LNM-MM) for clustering microbiome data. As microbiome data tend to be high dimensional, here we develop a family of logistic normal multinomial factor analyzers (LNM-FA) by incorporating a factor analyzer structure in the LNM-MM. This family of models is more suitable for high-dimensional data, as the number of parameters in LNM-FA can be greatly reduced by assuming that the number of latent factors is small. Parameter estimation is done using a computationally efficient variant of the alternating expectation conditional maximization algorithm that utilizes variational Gaussian approximations. The proposed method is illustrated using simulated and real datasets.
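The parameter reduction mentioned in the abstract can be illustrated with a rough count for a single mixture component. The sketch below assumes the standard factor analyzer decomposition of a covariance matrix into a loading matrix plus a diagonal noise matrix; the abstract does not spell out the exact parameterization or the constraints used across the LNM-FA family, so the function names and the dimensions chosen here are hypothetical.

```python
# Illustrative sketch only: compares the number of free covariance parameters
# for one component under an unconstrained covariance matrix versus a generic
# factor analyzer structure, Sigma = Lambda @ Lambda.T + Psi.
# Assumption: Lambda is (dim x n_factors) and Psi is diagonal (dim entries);
# the paper's own family of models may impose further constraints.


def full_cov_params(dim: int) -> int:
    """Free parameters in an unconstrained symmetric dim x dim covariance."""
    return dim * (dim + 1) // 2


def factor_cov_params(dim: int, n_factors: int) -> int:
    """Free parameters in Lambda @ Lambda.T + Psi.

    dim * n_factors loadings plus dim diagonal noise variances, minus
    n_factors * (n_factors - 1) / 2 to account for the rotational
    indeterminacy of the loading matrix.
    """
    return dim * n_factors + dim - n_factors * (n_factors - 1) // 2


if __name__ == "__main__":
    # Hypothetical dimensions chosen only for illustration.
    for dim, q in [(10, 2), (50, 3), (200, 5)]:
        print(
            f"dim={dim:>3}, factors={q}: "
            f"full={full_cov_params(dim):>6}, "
            f"factor={factor_cov_params(dim, q):>5}"
        )
```

With a small number of latent factors, the factor analyzer count grows roughly linearly in the dimension, whereas the unconstrained count grows quadratically, which is why such a structure suits high-dimensional data.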
- Publication: arXiv e-prints
- Pub Date: January 2021
- DOI: 10.48550/arXiv.2101.01871
- arXiv: arXiv:2101.01871
- Bibcode: 2021arXiv210101871T
- Keywords: Statistics - Methodology; Statistics - Computation; 62H30
- E-Print: 50 pages, 5 figures