Variable Selection in Covariate Dependent Random Partition Models: an Application to Urinary Tract Infection
Abstract
Lower urinary tract symptoms (LUTS) can indicate the presence of urinary tract infection (UTI), a condition that, if it becomes chronic, requires expensive and time-consuming care and leads to reduced quality of life. Detecting the presence and severity of an infection from the earliest symptoms is therefore highly valuable. Typically, white blood cell count (WBC) measured in a urine sample is used to assess UTI. We consider clinical data from 1341 patients at their first visit in which UTI (i.e. WBC$\geq 1$) is diagnosed. In addition, a clinical profile of 34 symptoms was recorded for each patient. In this paper we propose a Bayesian nonparametric regression model based on the Dirichlet Process (DP) prior, aimed at providing clinicians with a meaningful clustering of the patients based on both the WBC (response variable) and possible patterns within the symptom profiles (covariates). This is achieved by assuming a probability model for the symptoms as well as for the response variable. To identify the symptoms most associated with UTI, we specify a spike and slab base measure for the regression coefficients: this makes symptom selection depend on cluster assignment. Posterior inference is performed through Markov chain Monte Carlo methods.
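As a rough illustration of how a spike and slab base measure can tie symptom selection to cluster assignment, the following Python sketch draws from a prior of this general kind. It is not the paper's model or its sampler: the Chinese restaurant process representation of the DP, the concentration parameter `alpha`, the inclusion probability `pi`, the Gaussian slab with scale `slab_sd`, and all function names are assumptions made only for illustration.

```python
import random


def sample_crp_partition(n, alpha, rng):
    """Assign n items to clusters with a Chinese restaurant process (assumed DP representation)."""
    labels = []
    sizes = []  # sizes[k] = number of items currently in cluster k
    for _ in range(n):
        # An existing cluster k has weight sizes[k]; a new cluster has weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)  # open a new cluster
        else:
            sizes[k] += 1
        labels.append(k)
    return labels


def sample_spike_slab_coefficients(p, pi, slab_sd, rng):
    """Draw p coefficients: exactly 0 (spike) with prob. 1 - pi, Gaussian (slab) with prob. pi."""
    return [rng.gauss(0.0, slab_sd) if rng.random() < pi else 0.0 for _ in range(p)]


def sample_prior(n, p, alpha=1.0, pi=0.2, slab_sd=1.0, seed=0):
    """Draw a partition of n patients and one spike-and-slab coefficient vector per cluster."""
    rng = random.Random(seed)
    labels = sample_crp_partition(n, alpha, rng)
    n_clusters = max(labels) + 1
    # Hypothetical setup: each cluster gets its own draw from the base measure.
    coefs = [sample_spike_slab_coefficients(p, pi, slab_sd, rng) for _ in range(n_clusters)]
    # A symptom counts as "selected" in a cluster when its coefficient is non-zero.
    selected = [[j for j, b in enumerate(beta) if b != 0.0] for beta in coefs]
    return labels, coefs, selected


if __name__ == "__main__":
    labels, coefs, selected = sample_prior(n=20, p=34)
    for k, symptoms in enumerate(selected):
        print(f"cluster {k}: {labels.count(k)} patients, selected symptoms {symptoms}")
```

In this sketch all patients in a cluster share one coefficient vector, so the set of non-zero coefficients, and hence the set of selected symptoms, can differ from cluster to cluster.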
- Publication: arXiv e-prints
- Pub Date: January 2015
- DOI: 10.48550/arXiv.1501.03537
- arXiv: arXiv:1501.03537
- Bibcode: 2015arXiv150103537B
- Keywords: Statistics - Applications
- E-Print: Revised version. 24 pages, 6 figures