Posterior Predictive P-values with Fisher Randomization Tests in Noncompliance Settings: Test Statistics vs Discrepancy Variables
Abstract
In randomized experiments with noncompliance, tests may focus on compliers rather than on the overall sample. Rubin (1998) proposed such a method, arguing that testing for the complier average causal effect and averaging permutation-based p-values over the posterior distribution of the compliance statuses could increase power compared to general intent-to-treat tests. The general scheme is to repeat a two-step process: impute the missing compliance statuses, then conduct a permutation test with the completed data. In this paper, we explore this idea further, comparing the use of discrepancy measures, which depend on unknown but imputed parameters, to classical test statistics, and exploring different approaches for imputing the unknown compliance statuses. We also examine the consequences of model misspecification in the imputation step, and discuss to what extent this additional modeling undercuts the permutation test's model independence. We find that, especially for discrepancy measures, modeling choices can affect both power and validity. In particular, imputing missing compliance statuses assuming the null can radically reduce power, but not doing so can jeopardize validity. Fortunately, covariates predictive of compliance status can mitigate these problems. Finally, we compare this overall approach to Bayesian model-based tests, that is, tests derived directly from posterior credible intervals, under both correct and incorrect model specification. We find that adding the permutation step to an otherwise Bayesian approach improves robustness to model specification without substantial loss of power.
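The two-step scheme described in the abstract (impute compliance statuses, run a permutation test on the completed data, average the p-values) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: one-sided noncompliance (control units cannot take treatment), complete randomization, a Beta-Bernoulli imputation model that ignores outcomes and covariates, and the hypothetical names `ppp_frt` and `complier_diff`. The measure used, a difference in means among (imputed) compliers, depends on the imputed statuses, so it plays the role of a discrepancy measure rather than a classical test statistic.

```python
import numpy as np

def ppp_frt(y, z, d, n_imputations=200, n_permutations=200, seed=0):
    """Average permutation p-value over imputed compliance statuses.

    y: outcomes, z: assignment (0/1), d: treatment received (0/1).
    Assumes one-sided noncompliance, so d == 0 whenever z == 0 and
    compliance status is observed only in the treated arm.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=int)
    d = np.asarray(d, dtype=int)
    treated = z == 1
    n_comp = int(d[treated].sum())          # observed compliers
    n_never = int(treated.sum()) - n_comp   # observed never-takers

    def complier_diff(assign, c):
        # Difference in mean outcome among compliers, by assignment arm.
        t = (assign == 1) & (c == 1)
        u = (assign == 0) & (c == 1)
        if not t.any() or not u.any():
            return 0.0
        return y[t].mean() - y[u].mean()

    p_values = np.empty(n_imputations)
    for m in range(n_imputations):
        # Step 1 (assumed model): draw the complier share from its
        # Beta(1, 1)-prior posterior, then impute control-arm statuses.
        share = rng.beta(1 + n_comp, 1 + n_never)
        c = d.copy()
        c[~treated] = rng.binomial(1, share, size=int((~treated).sum()))

        # Step 2: Fisher randomization test under the sharp null,
        # holding outcomes and completed compliance statuses fixed.
        observed = abs(complier_diff(z, c))
        exceed = 0
        for _ in range(n_permutations):
            z_perm = rng.permutation(z)
            if abs(complier_diff(z_perm, c)) >= observed:
                exceed += 1
        p_values[m] = (exceed + 1) / (n_permutations + 1)

    # Posterior predictive p-value: mean over imputations.
    return float(p_values.mean())
```

Because this imputation step ignores the outcomes, it is only one of several possible choices; the abstract's point is precisely that such modeling choices, including whether statuses are imputed under the null, can affect both power and validity.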
- Publication: arXiv e-prints
- Pub Date: November 2015
- DOI: 10.48550/arXiv.1511.00521
- arXiv: arXiv:1511.00521
- Bibcode: 2015arXiv151100521F
- Keywords: Statistics - Methodology