Combining experimental and observational data through a power likelihood
Abstract
Randomized controlled trials are the gold standard for causal inference and play a pivotal role in modern evidence-based medicine. However, the sample sizes they use are often too limited to draw significant causal conclusions for subgroups that are less prevalent in the population. In contrast, observational data are becoming increasingly accessible in large volumes but can be subject to bias as a result of hidden confounding. Given these complementary features, we propose a power likelihood approach to augmenting RCTs with observational data to improve the efficiency of treatment effect estimation. We provide a data-adaptive procedure for maximizing the expected log predictive density (ELPD) to select the learning rate that best regulates the information from the observational data. We validate our method through a simulation study that shows increased power while maintaining an approximate nominal coverage rate. Finally, we apply our method in a real-world data fusion study augmenting the PIONEER 6 clinical trial with a US health claims dataset, demonstrating the effectiveness of our method and providing detailed guidance on how to address practical considerations in its application.
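To make the power likelihood idea concrete, the following is a minimal sketch under strong simplifying assumptions, not the authors' procedure. It assumes a toy normal model with known standard deviation `sigma`, a flat prior on a single effect `theta`, and a K-fold cross-validation estimate of the ELPD scored on held-out trial observations only. The function names (`power_posterior`, `cv_elpd`, `select_eta`) and the grid of candidate learning rates are hypothetical choices made for illustration.

```python
import numpy as np

def power_posterior(y_rct, y_obs, eta, sigma=1.0):
    """Posterior mean and variance of theta under a flat prior, with the
    observational likelihood raised to the power eta in [0, 1].
    (Toy conjugate normal model; an assumption of this sketch.)"""
    n_eff = len(y_rct) + eta * len(y_obs)  # effective sample size
    mean = (np.sum(y_rct) + eta * np.sum(y_obs)) / n_eff
    var = sigma**2 / n_eff
    return mean, var

def cv_elpd(y_rct, y_obs, eta, sigma=1.0, k=5):
    """K-fold estimate of the ELPD, evaluated on held-out RCT points only."""
    folds = np.array_split(np.arange(len(y_rct)), k)
    total = 0.0
    for held_out in folds:
        train = np.delete(y_rct, held_out)
        mean, var = power_posterior(train, y_obs, eta, sigma)
        pred_var = sigma**2 + var  # posterior predictive variance
        resid = y_rct[held_out] - mean
        total += np.sum(-0.5 * np.log(2 * np.pi * pred_var)
                        - 0.5 * resid**2 / pred_var)
    return total

def select_eta(y_rct, y_obs, grid=None, sigma=1.0, k=5):
    """Pick the learning rate on a grid that maximizes the estimated ELPD."""
    grid = np.linspace(0.0, 1.0, 21) if grid is None else np.asarray(grid)
    scores = np.array([cv_elpd(y_rct, y_obs, e, sigma, k) for e in grid])
    return grid[int(np.argmax(scores))], scores

# Simulated example: a small trial and a large, heavily biased observational sample.
rng = np.random.default_rng(0)
y_rct = rng.normal(0.5, 1.0, size=50)
y_obs_biased = rng.normal(2.0, 1.0, size=2000)
eta_hat, elpd_scores = select_eta(y_rct, y_obs_biased)
```

In this toy setting, `eta = 0` discards the observational data and `eta = 1` pools the two sources fully; intermediate values down-weight the observational likelihood. When the observational sample is strongly biased relative to the trial, the held-out trial points are predicted poorly for larger `eta`, so the selected learning rate should be pushed toward zero.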
- Publication: arXiv e-prints
- Pub Date: April 2023
- arXiv: arXiv:2304.02339
- Bibcode: 2023arXiv230402339L
- Keywords: Statistics - Methodology; Mathematics - Statistics Theory