Evaluating Sensitivity to the Stick Breaking Prior in Bayesian Nonparametrics
Abstract
A central question in many probabilistic clustering problems is how many distinct clusters are present in a particular dataset. A Bayesian nonparametric (BNP) model addresses this question by placing a generative process on cluster assignment. However, like all Bayesian approaches, BNP requires the specification of a prior. In practice, it is important to quantitatively establish that the prior is not too informative, particularly when the particular form of the prior is chosen for mathematical convenience rather than because of a considered subjective belief. We derive local sensitivity measures for a truncated variational Bayes (VB) approximation and approximate nonlinear dependence of a VB optimum on prior parameters using a local Taylor series approximation. Using a stick-breaking representation of a Dirichlet process, we consider perturbations both to the scalar concentration parameter and to the functional form of the stick-breaking distribution. Unlike previous work on local Bayesian sensitivity for BNP, we pay special attention to the ability of our sensitivity measures to extrapolate to different priors, rather than treating the sensitivity as a measure of robustness per se. Extrapolation motivates the use of multiplicative perturbations to the functional form of the prior for VB. Additionally, we linearly approximate only the computationally intensive part of inference (the optimization of the global parameters) and retain the nonlinearity of easily computed quantities as functions of the global parameters. We apply our methods to estimate sensitivity of the expected number of distinct clusters present in the Iris dataset to the BNP prior specification. We evaluate the accuracy of our approximations by comparing to the much more expensive process of refitting the model.
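To make the role of the concentration parameter concrete, the following is a minimal sketch of a truncated stick-breaking construction and of the expected number of distinct clusters it implies. It illustrates the prior only and is not the paper's VB sensitivity method. The function names, the truncation level `k_trunc`, the Monte Carlo settings, and the choice `n_obs=150` (the size of the Iris dataset) are assumptions made for illustration, as is the standard Dirichlet process choice of Beta(1, alpha) stick proportions.

```python
import numpy as np


def stick_breaking_weights(sticks):
    """Map stick proportions v (shape [..., K-1]) to K mixture weights."""
    # Stick length remaining after each break: prod_{j<=k} (1 - v_j).
    remaining = np.cumprod(1.0 - sticks, axis=-1)
    ones = np.ones(sticks.shape[:-1] + (1,))
    # Length available before break k (the full stick for the first break).
    remaining_before = np.concatenate([ones, remaining], axis=-1)
    # Truncation: the final component takes whatever stick is left.
    v_full = np.concatenate([sticks, ones], axis=-1)
    return v_full * remaining_before


def expected_num_clusters(alpha, n_obs, k_trunc=30, n_draws=5000, seed=0):
    """Monte Carlo estimate of the prior expected number of occupied clusters."""
    rng = np.random.default_rng(seed)
    # Assumed DP stick-breaking prior: v_k ~ Beta(1, alpha).
    sticks = rng.beta(1.0, alpha, size=(n_draws, k_trunc - 1))
    weights = stick_breaking_weights(sticks)
    # Given the weights, cluster k is occupied by at least one of n_obs
    # points with probability 1 - (1 - pi_k)^n_obs.
    occupied = 1.0 - (1.0 - weights) ** n_obs
    return occupied.sum(axis=-1).mean()


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 4.0):
        print(alpha, expected_num_clusters(alpha, n_obs=150))
```

Larger values of `alpha` put more mass on later sticks, so the expected number of occupied clusters grows with `alpha`. This dependence on the prior is the kind that a sensitivity analysis aims to quantify for the posterior.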
Publication: arXiv e-prints
Pub Date: October 2018
arXiv: arXiv:1810.06587
Bibcode: 2018arXiv181006587L
Keywords: Statistics - Methodology
E-Print: 8 pages, 6 figures