General Bayesian Loss Function Selection and the use of Improper Models
Abstract
Statisticians often face the choice between using probability models or a paradigm defined by minimising a loss function. Both approaches are useful and, if the loss can be recast into a proper probability model, there are many tools to decide which model or loss is more appropriate for the observed data, in the sense of explaining the data's nature. However, when the loss leads to an improper model, there are no principled ways to guide this choice. We address this task by combining the Hyvärinen score, which naturally targets infinitesimal relative probabilities, and general Bayesian updating, which provides a unifying framework for inference on losses and models. Specifically, we propose the H-score, a general Bayesian selection criterion, and prove that it consistently selects the (possibly improper) model closest to the data-generating truth in Fisher's divergence. We also prove that an associated H-posterior consistently learns optimal hyperparameters featuring in loss functions, including a challenging tempering parameter in generalised Bayesian inference. As salient examples, we consider robust regression and nonparametric density estimation, where popular loss functions define improper models for the data and hence cannot be dealt with using standard model selection tools. These examples illustrate advantages in robustness-efficiency trade-offs and provide a Bayesian implementation for kernel density estimation, opening a new avenue for Bayesian nonparametrics.
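As a concrete illustration (not taken from the paper), the Hyvärinen score of a univariate density at a point x is H(x) = 2 (d²/dx²) log p(x) + ((d/dx) log p(x))², which depends only on derivatives of log p(x) and is therefore unaffected by the normalising constant, the property that makes it usable for improper models. A minimal numerical sketch, with illustrative function names and a Gaussian log-density as the assumed example:

```python
def hyvarinen_score(logp, x, h=1e-4):
    """Approximate H(x) = 2 * (log p)''(x) + ((log p)'(x))**2
    via central finite differences (illustrative sketch only)."""
    d1 = (logp(x + h) - logp(x - h)) / (2 * h)          # first derivative
    d2 = (logp(x + h) - 2 * logp(x) + logp(x - h)) / h**2  # second derivative
    return 2 * d2 + d1**2

# Unnormalised Gaussian log-density: the score ignores the missing
# normalising constant, which is why it suits improper models.
mu, sigma = 0.0, 1.5
logp = lambda x: -(x - mu) ** 2 / (2 * sigma**2)

x = 0.7
# Closed form for the Gaussian: -2/sigma^2 + (x - mu)^2 / sigma^4
closed_form = -2 / sigma**2 + (x - mu) ** 2 / sigma**4
print(abs(hyvarinen_score(logp, x) - closed_form) < 1e-5)  # → True
```

Summing such scores over observations gives a loss-based analogue of a log marginal likelihood, which is the role the H-score plays for model selection in the abstract above.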
 Publication:
 arXiv e-prints
 Pub Date:
 June 2021
 arXiv:
 arXiv:2106.01214
 Bibcode:
 2021arXiv210601214J
 Keywords:
 Statistics - Methodology; Loss functions