Computing Bayesian Model Evidence for high resolution data-sets using the Method of Forced Probabilities

Bayesian model selection objectively ranks competing models of different structure and with different parameters based on a common calibration data set. This technique requires the evaluation of Bayesian Model Evidence (BME). BME is the likelihood of the data under the assumed model, averaged over the model's parameter space. Exact and fast analytical solutions for BME exist only under strong assumptions. Mathematical approximations via information criteria claim to be more generally applicable but suffer from strong biases in real-world applications. Numerical methods, such as Monte Carlo techniques, do not rely on such assumptions but require high computational effort. This effort becomes prohibitive if the data set is very large, e.g., highly resolved in space and time, like experimental videos or time series of images. To still enable the use of BME as a rigorous probabilistic model performance metric in such cases, we have developed the Method of Forced Probabilities (MFP). The core idea is to swap the direction of evaluation: instead of comparing thousands of forward model runs on random parameter realizations with the observed data, we force the model to reproduce the data at each time step and record the individual probabilities of the model performing these exact transitions. This method is a fast and accurate way to compute BME for models that predict time series and satisfy the Markov property in time, paired with high-quality atomic event-type data sets. As a test case for demonstration, we apply the method to invasion percolation models, which simulate multiphase flow in porous media. The corresponding highly resolved data set was obtained from an experiment of slow gas injection into water-saturated, homogeneous sand in a 25 cm × 25 cm acrylic glass cell. Images were acquired at a rate of 30 images per second using the light transmission technique.
Although the image series does not always meet these data-quality requirements, we can still apply the MFP by introducing workarounds. The results confirm that, by making the evaluation of BME feasible, the proposed method enables data assimilation in previously infeasible scenarios. The BME values obtained under different modeling assumptions can then be used to gain insights for future model improvement.
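
The core idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical two-state toy model (`two_state_model`) with a single parameter `theta`, a small hand-picked set of prior samples, and an invented observed sequence. For each parameter sample, the model is forced along the observed sequence and the log-probabilities of the observed transitions are summed (this relies on the Markov property). The resulting likelihoods are then averaged over the prior samples to approximate BME.

```python
import math


def forced_log_likelihood(states, transition):
    """Log-probability that the model performs exactly the observed transitions.

    Assumes the Markov property: each step depends only on the previous state.
    `transition[i][j]` is the model's probability of moving from state i to j.
    """
    log_l = 0.0
    for prev, nxt in zip(states, states[1:]):
        p = transition[prev][nxt]
        if p == 0.0:
            return -math.inf  # the model cannot reproduce this transition
        log_l += math.log(p)
    return log_l


def log_bme(states, prior_samples, make_transition):
    """Log of the likelihood averaged over equally weighted prior samples."""
    logs = [forced_log_likelihood(states, make_transition(theta))
            for theta in prior_samples]
    m = max(logs)
    if m == -math.inf:
        return -math.inf
    # log-sum-exp keeps the average numerically stable for long sequences
    return m + math.log(sum(math.exp(l - m) for l in logs) / len(logs))


def two_state_model(theta):
    """Hypothetical toy model: switch state with probability theta."""
    return [[1.0 - theta, theta], [theta, 1.0 - theta]]


observed = [0, 0, 1, 1, 1, 0]            # invented observed state sequence
prior = [0.1, 0.3, 0.5, 0.7, 0.9]        # invented prior samples of theta
print(log_bme(observed, prior, two_state_model))
```

In this sketch, each parameter sample needs only one pass over the data, rather than many random forward runs that are compared with the data afterwards. Working in log space matters because the product of many per-step probabilities quickly underflows for long, highly resolved series.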

Publication: AGU Fall Meeting Abstracts
Pub Date: December 2021
Bibcode: 2021AGUFMNG21A..04W
