Bayesian Uncertainty Estimation Under Complex Sampling
Abstract
Social and economic studies are often implemented as complex survey designs. For example, multistage, unequal probability sampling designs utilized by federal statistical agencies are typically constructed to maximize the efficiency of the target domain level estimator (e.g., indexed by geographic area) within cost constraints for survey administration. Such designs may induce dependence between the sampled units; for example, through a sampling step that selects geographically indexed clusters of units. A sampling-weighted pseudo-posterior distribution may be used to estimate the population model on the observed sample. The dependence induced between co-clustered units inflates the scale of the resulting pseudo-posterior covariance matrix, which has been shown to induce undercoverage of the credibility sets. By bridging results across Bayesian model misspecification and survey sampling, we demonstrate that the scale and shape of the asymptotic distributions differ among the pseudo-MLE, the pseudo-posterior, and the MLE under simple random sampling. Through insights from survey sampling variance estimation and recent advances in computational methods, we devise a correction applied as a simple and fast post-processing step to MCMC draws of the pseudo-posterior distribution. This adjustment projects the pseudo-posterior covariance matrix such that the nominal coverage is approximately achieved. We apply the method to the National Survey on Drug Use and Health as a motivating example, and we demonstrate the efficacy of our scale and shape projection procedure on synthetic data for several common archetypes of survey designs.
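The post-processing idea described in the abstract, a linear adjustment of MCMC draws so that their covariance matches a target, can be illustrated with a minimal sketch. This is not the authors' procedure as published; it only shows the generic technique of rescaling centered draws through Cholesky factors. The function name `rescale_draws`, the simulated `draws`, and the matrix `target_cov` (a stand-in for whatever design-based variance estimate one would use) are all assumptions made for illustration.

```python
import numpy as np


def rescale_draws(draws, target_cov):
    """Linearly map draws so their sample covariance equals target_cov.

    draws: (S, p) array of posterior draws (hypothetical input).
    target_cov: (p, p) positive-definite matrix (assumed to be supplied
    by the analyst, e.g. from a design-based variance estimate).
    """
    center = draws.mean(axis=0)
    centered = draws - center
    # Lower Cholesky factors of the draws' covariance and of the target.
    chol_draws = np.linalg.cholesky(np.atleast_2d(np.cov(centered, rowvar=False)))
    chol_target = np.linalg.cholesky(target_cov)
    # Whiten the centered draws, then impose the target scale and shape.
    whitened = np.linalg.solve(chol_draws, centered.T)
    return (chol_target @ whitened).T + center


# Simulated stand-in for pseudo-posterior draws (illustrative only).
rng = np.random.default_rng(0)
draws = rng.multivariate_normal([1.0, -2.0], [[1.0, 0.3], [0.3, 0.5]], size=2000)
target_cov = np.array([[2.0, 0.8], [0.8, 1.5]])
adjusted = rescale_draws(draws, target_cov)
```

In this sketch the adjusted draws keep the original posterior mean while their sample covariance equals `target_cov`, so intervals computed from `adjusted` reflect the target scale and shape rather than those of the unadjusted draws.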
- Publication: arXiv e-prints
- Pub Date: July 2018
- arXiv: arXiv:1807.11796
- Bibcode: 2018arXiv180711796W
- Keywords: Statistics - Methodology; 62D05; 62F15; 62F12
- E-Print: 45 pages, 4 figures, 1 table