Mixture-based Multiple Imputation Model for Clinical Data with a Temporal Dimension
Abstract
Missing values in multivariable time series are a key challenge in many applications, such as clinical data mining. Although many imputation methods have proven effective across a range of applications, few are designed to accommodate clinical multivariable time series. In this work, we propose a multiple imputation model that captures both cross-sectional information and temporal correlations. We integrate Gaussian processes with mixture models and introduce individualized mixing weights to account for variation in the predictive confidence of the Gaussian process models. The proposed model is compared with several state-of-the-art imputation algorithms on both real-world and synthetic datasets. Experiments show that our best model provides more accurate imputation than the benchmarks on all of our datasets.
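The idea of weighting a temporal Gaussian process prediction by its own confidence can be illustrated with a small sketch. This is not the paper's model: it swaps the mixture model for plain precision-weighted averaging, and every name and number in it (`rbf_kernel`, `gp_predict`, `blend`, the kernel settings, the cross-sectional mean `cs_mean` and variance `cs_var`, the toy lab values) is a hypothetical choice made for illustration only.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two 1-D arrays of time stamps."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(t_obs, y_obs, t_new, noise_var=0.1, length_scale=1.0, signal_var=1.0):
    """Posterior mean and variance of a zero-mean GP at the times t_new."""
    K = rbf_kernel(t_obs, t_obs, length_scale, signal_var) + noise_var * np.eye(len(t_obs))
    K_s = rbf_kernel(t_new, t_obs, length_scale, signal_var)
    mean = K_s @ np.linalg.solve(K, y_obs)
    # Diagonal of K_s K^{-1} K_s^T: how much the data reduce the prior variance.
    reduction = np.einsum("ij,ji->i", K_s, np.linalg.solve(K, K_s.T))
    return mean, np.maximum(signal_var - reduction, 1e-12)

def blend(gp_mean, gp_var, cs_mean, cs_var):
    """Precision-weighted blend (an assumption, standing in for the mixture model).

    w is the weight on the GP prediction; it shrinks as the GP variance grows.
    """
    w = cs_var / (cs_var + gp_var)
    return w * gp_mean + (1.0 - w) * cs_mean, w

# Toy series for one patient (hypothetical values): observed times and lab values.
t_obs = np.array([0.0, 1.0, 2.0, 10.0])
y_obs = np.array([5.2, 5.6, 5.9, 5.1])
t_missing = np.array([1.5, 6.0])   # one gap near observations, one far from them

cs_mean, cs_var = 5.0, 0.25        # assumed cross-sectional estimate and its variance

# Fit the GP to residuals around the cross-sectional mean, then add the mean back.
resid_mean, gp_var = gp_predict(t_obs, y_obs - cs_mean, t_missing)
gp_mean = resid_mean + cs_mean

imputed, w = blend(gp_mean, gp_var, cs_mean, cs_var)

# Multiple imputation: draw several plausible values per missing entry.
blend_var = w * gp_var             # equals gp_var * cs_var / (gp_var + cs_var)
rng = np.random.default_rng(0)
draws = rng.normal(imputed, np.sqrt(blend_var), size=(5, len(t_missing)))
```

In this sketch the weight on the temporal prediction is high at t = 1.5, where nearby observations keep the GP variance small, and low at t = 6.0, where the GP reverts toward its prior and the cross-sectional estimate dominates.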
- Publication: arXiv e-prints
- Pub Date: August 2019
- arXiv: arXiv:1908.04209
- Bibcode: 2019arXiv190804209X
- Keywords: Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: doi:10.1109/BigData47090.2019.9005672