Sample size and power calculation for propensity score analysis of observational studies
Abstract
This paper develops theoretically justified analytical formulas for sample size and power calculation in the propensity score analysis of causal inference using observational data. By analyzing the variance of the inverse probability weighting estimator of the average treatment effect (ATE), we clarify the three key components for sample size calculations: propensity score distribution, potential outcome distribution, and their correlation. We devise analytical procedures to identify these components based on commonly available and interpretable summary statistics. We elucidate the critical role of covariate overlap between treatment groups in determining the sample size. In particular, we propose to use the Bhattacharyya coefficient as a measure of covariate overlap, which, together with the treatment proportion, leads to a uniquely identifiable and easily computable propensity score distribution. The proposed method is applicable to both continuous and binary outcomes. We show that the standard two-sample $z$-test and variance inflation factor methods often lead to, sometimes vastly, inaccurate sample size estimates, especially with limited overlap. We also derive formulas for the average treatment effects for the treated (ATT) and overlapped population (ATO) estimands. We provide simulated and real examples to illustrate the proposed method. We develop an associated R package PSpower.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2025
- arXiv:
- arXiv:2501.11181
- Bibcode:
- 2025arXiv250111181L
- Keywords:
-
- Statistics - Methodology