Incorporating Spontaneous Reporting System Data to Aid Causal Inference in Longitudinal Healthcare Data
Abstract
Inferring causality from longitudinal observational databases is challenging because the data are collected passively. Most associations found in such data are non-causal and arise from confounding. This paper investigates incorporating information from additional databases to complement the analysis of a longitudinal observational database. We focus on detecting prescription drug side effects, as these are an example of a causal relationship. Previous work proposed a framework for detecting side effects using longitudinal data only. Here we combine a measure of association, derived from mining a spontaneous reporting system database, with the previously proposed analysis, which extracts domain expertise features for causal analysis of a UK general practice longitudinal database. The results show a significant improvement in the detection of prescription drug side effects when the longitudinal observational data analysis is complemented by additional drug safety sources. The area under the receiver operating characteristic curve (AUC) for correctly classifying a side effect was 0.967 when the other data were considered and 0.923 without them. However, these results may be biased by the evaluation, and future work should address this by developing an unbiased reference set.
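The AUC comparison reported in the abstract can be illustrated with a minimal sketch. The labels and scores below are hypothetical toy values, not data or results from the paper, and the function name `auc` and both score lists are assumptions introduced only for illustration. The sketch computes the AUC as the fraction of (side effect, non side effect) pairs that a classifier ranks correctly, counting ties as half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (Mann-Whitney formulation).

    labels: 1 for a known side effect, 0 for a non side effect.
    scores: classifier confidence that the drug-event pair is a side effect.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: the same drug-event pairs scored by two classifiers.
labels = [1, 1, 1, 0, 0, 0]
scores_longitudinal = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]  # longitudinal features only (assumed)
scores_combined = [0.9, 0.8, 0.6, 0.7, 0.3, 0.2]      # plus an SRS association feature (assumed)

if __name__ == "__main__":
    print("AUC, longitudinal only:", auc(labels, scores_longitudinal))
    print("AUC, combined:", auc(labels, scores_combined))
```

A higher AUC for the combined scores would correspond to the kind of improvement the abstract describes, although the toy numbers here say nothing about the actual magnitude.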
- Publication:
- arXiv e-prints
- Pub Date:
- February 2015
- DOI:
- 10.48550/arXiv.1502.05938
- arXiv:
- arXiv:1502.05938
- Bibcode:
- 2015arXiv150205938R
- Keywords:
- Computer Science - Computational Engineering, Finance, and Science
- E-Print:
- IEEE International Conference of Data Mining: The Fifth Workshop on Biological Data Mining and its Applications in Healthcare, 2014