Appropriate reduction of the posterior distribution in fully Bayesian inversions
Abstract
Bayesian inversion generates a posterior distribution of model parameters from an observation equation and prior information, both weighted by hyperparameters. In fully Bayesian inversions, a prior is also introduced for the hyperparameters, enabling us to evaluate both the model parameters and the hyperparameters probabilistically through the joint posterior. However, even in a linear inverse problem, it remains an open question how to extract useful information on the model parameters from the joint posterior. This study presents a theoretical exploration of the appropriate dimensionality reduction of the joint posterior in fully Bayesian inversion. We classify the ways of probability reduction into the following three categories, according to how the joint posterior is marginalized: (1) using the joint posterior without marginalization, (2) using the marginal posterior of the model parameters and (3) using the marginal posterior of the hyperparameters. First, we derive several analytical results that characterize these categories. One is a suite of semi-analytic representations of the probability maximization estimators for the respective categories in the linear inverse problem. The mode estimators of categories (1) and (2) are found to be asymptotically identical for large numbers of data and model parameters. We also prove that the asymptotic distributions of categories (2) and (3) concentrate delta-functionally on their probability peaks, which predicts two distinct optimal estimates of the model parameters. Second, we conduct a synthetic test and find that an appropriate reduction is realized by category (3), typified by Akaike's Bayesian information criterion (ABIC). The other reduction categories are shown to be inappropriate for the case of many model parameters, where the probability concentration of the marginal posterior of the model parameters no longer implies the central limit theorem.
The main cause of these results is that the joint posterior peaks sharply at an underfitted or overfitted solution as the number of model parameters increases. The exponential growth of the probability space in the model-parameter dimension makes almost-zero-probability events contribute finitely to the posterior mean and renders the distributions of categories (1) and (2) pathological. One remedy for this pathology is counting all model-parameter realizations by integrating the joint posterior over the model-parameter space of exponential multiplicity. Hence, the marginal posterior of the hyperparameters for category (3) becomes appropriate and can conform to the law of large numbers even with numerous model parameters. The exponential rarity of the posterior mean and ABIC estimates implies the exponential time complexity of ordinary Monte Carlo methods in population mean and ABIC computations. We also present a geophysical application estimating a continuous strain-rate field from spatially discrete global navigation satellite system data, demonstrating that denser basis-function expansions of the model-parameter field lead to oversmoothed estimates in naive fully Bayesian approaches, while detailed fields are resolved with convergence under the reduction of category (3). We often naively believe a good solution can be constructed from a finite number of high-probability samples, but the high-probability domain could be inappropriate, and exponentially many samples become necessary for generating appropriate estimates in the high-dimensional fully Bayesian posterior probability space.
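The category (3) reduction described above can be illustrated concretely. The following is a minimal sketch (a toy illustration, not the paper's code; the problem sizes, variable names and the grid search are all hypothetical choices) for a Gaussian linear model with known noise variance: the model parameters are marginalized analytically, and the prior-variance hyperparameter is selected by maximizing the resulting marginal likelihood, i.e. an ABIC-style criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear inverse problem: d = G m + noise (all sizes hypothetical)
n_data, n_model = 40, 20
G = rng.normal(size=(n_data, n_model))
m_true = rng.normal(size=n_model)
sigma = 0.5                                   # observation noise std, assumed known
d = G @ m_true + sigma * rng.normal(size=n_data)

def neg2_log_marginal_likelihood(alpha2):
    """-2 log p(d | alpha2) up to a constant, for a prior m ~ N(0, alpha2 I).

    Integrating the model parameters out of the joint posterior gives
    d ~ N(0, sigma^2 I + alpha2 G G^T); minimizing this quantity over the
    hyperparameter is the ABIC-style, category (3) reduction."""
    C = sigma**2 * np.eye(n_data) + alpha2 * (G @ G.T)
    _, logdet = np.linalg.slogdet(C)
    return logdet + d @ np.linalg.solve(C, d)

# Simple grid search over the prior-variance hyperparameter
alphas2 = np.logspace(-3, 2, 200)
scores = [neg2_log_marginal_likelihood(a) for a in alphas2]
alpha2_hat = alphas2[int(np.argmin(scores))]

# Posterior mean of m conditioned on the selected hyperparameter
A = G.T @ G / sigma**2 + np.eye(n_model) / alpha2_hat
m_hat = np.linalg.solve(A, G.T @ d / sigma**2)
print("selected prior variance:", alpha2_hat)
```

In contrast, joint maximization over model parameters and hyperparameters (category 1) or maximization of the model-parameter marginal (category 2) would, per the abstract, drift toward underfitted or oversmoothed solutions as `n_model` grows.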
Publication: Geophysical Journal International
Pub Date: November 2022
DOI: 10.1093/gji/ggac231
arXiv: arXiv:2205.07559
Bibcode: 2022GeoJI.231..950S
Keywords:
 Inverse theory;
 Probability distributions;
 Spatial analysis;
 Statistical methods;
 Physics - Geophysics;
 Mathematics - Statistics Theory
E-Print: 70 pages, 10 figures