A Bernoulli Mixture Model to Understand and Predict Children Longitudinal Wheezing Patterns
Abstract
In this research, we estimate that around $27.99(\pm2.15)\%$ of the population in the United Kingdom has experienced wheezing before turning 1. Furthermore, based on a cohort of $N=1184$, the Bernoulli Mixture Model classification is found to work best with $K=4$ clusters, which balances the separability of the clusters against their explanatory nature. The probability that parents in the $j$th cluster report that their children wheezed at the $i$th age is assumed to be $P_{ij} \sim \text{Beta}(1/2, 1/2)$, the probabilities of assignment to each cluster are $R \sim \text{Dirichlet}_K(\alpha)$, the assignment of the $n$th patient to a cluster is $Z_n\ |\ R \sim \text{Categorical}(R)$, and the indicator that the $n$th patient wheezed at the $i$th age is $X_{in}\ |\ P_{ij}, Z_n \sim \text{Bernoulli}(P_{i,Z_n})$, where $i\in\{1,\dots,6\}$, $j\in\{1,\dots,K\}$, and $n\in\{1,\dots, N\}$. The classification is then performed through the E-M optimization algorithm. We found that this clustering method efficiently groups patients into late-childhood wheezing, persistent wheezing, early-childhood wheezing, and none or sporadic wheezing. Furthermore, we found that this method does not depend on the particular data set and can accommodate data sets with missing entries.
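The E-M step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain maximum-likelihood updates rather than the $\text{Beta}(1/2, 1/2)$ and Dirichlet priors, and it assumes missing entries are coded as NaN and simply left out of the likelihood. The function name `fit_bernoulli_mixture` and its parameters are hypothetical.

```python
import numpy as np

def fit_bernoulli_mixture(X, K=4, n_iter=200, seed=0, eps=1e-6):
    """Maximum-likelihood EM for a Bernoulli mixture (illustrative sketch).

    X is an (N, I) array with entries 0, 1, or np.nan for a missing answer.
    Returns mixing weights R (K,), probabilities P (K, I), and
    responsibilities resp (N, K).
    """
    rng = np.random.default_rng(seed)
    N, I = X.shape
    observed = ~np.isnan(X)                  # True where an answer was recorded
    obs = observed.astype(float)
    X0 = np.where(observed, X, 0.0)          # zero out missing entries (masked below)

    R = np.full(K, 1.0 / K)                  # cluster assignment probabilities
    P = rng.uniform(0.25, 0.75, size=(K, I)) # P[j, i]: P(wheeze at age i | cluster j)

    for _ in range(n_iter):
        # E-step: posterior cluster probabilities from observed entries only.
        log_lik = X0 @ np.log(P).T + ((1.0 - X0) * obs) @ np.log(1.0 - P).T
        log_post = np.log(R) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: weighted frequencies, counting only observed entries.
        R = np.clip(resp.mean(axis=0), eps, None)
        R /= R.sum()
        P = np.clip((resp.T @ X0) / np.maximum(resp.T @ obs, eps), eps, 1.0 - eps)

    return R, P, resp
```

Each patient can then be assigned to the cluster with the largest responsibility, for example `labels = resp.argmax(axis=1)`. Clipping `P` away from 0 and 1 is a numerical convenience in this sketch, not a substitute for the priors stated in the abstract.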
- Publication: arXiv e-prints
- Pub Date: May 2020
- DOI: 10.48550/arXiv.2005.02931
- arXiv: arXiv:2005.02931
- Bibcode: 2020arXiv200502931M
- Keywords: Statistics - Applications