All Data Inclusive, Deep Learning Models to Predict Critical Events in the Medical Information Mart for Intensive Care III Database (MIMIC III)
Abstract
Intensive care clinicians need reliable clinical practice tools to preempt unexpected critical events that might harm their patients in intensive care units (ICU), to pre-plan timely interventions, and to keep the patient's family well informed. The conventional statistical models are built by curating only a limited number of key variables, which means a vast unknown amount of potentially precious data remains unused. Deep learning models (DLMs) can be leveraged to learn from large complex datasets and construct predictive clinical tools. This retrospective study was performed using 42,818 hospital admissions involving 35,348 patients, which is a subset of the MIMIC-III dataset. Natural language processing (NLP) techniques were applied to build DLMs to predict in-hospital mortality (IHM) and length of stay >=7 days (LOS). Over 75 million events across multiple data sources were processed, resulting in over 355 million tokens. DLMs for predicting IHM using data from all sources (AS) and chart data (CS) achieved an AUC-ROC of 0.9178 and 0.9029, respectively, and PR-AUC of 0.6251 and 0.5701, respectively. DLMs for predicting LOS using AS and CS achieved an AUC-ROC of 0.8806 and 0.8642, respectively, and PR-AUC of 0.6821 and 0.6575, respectively. The observed AUC-ROC difference between models was found to be significant for both IHM and LOS at p=0.05. The observed PR-AUC difference between the models was found to be significant for IHM and statistically insignificant for LOS at p=0.05. In this study, deep learning models were constructed using data combined from a variety of sources in Electronic Health Records (EHRs) such as chart data, input and output events, laboratory values, microbiology events, procedures, notes, and prescriptions. It is possible to predict in-hospital mortality with much better confidence and higher reliability from models built using all sources of data.
- Publication:
-
arXiv e-prints
- Pub Date:
- September 2020
- DOI:
- 10.48550/arXiv.2009.01366
- arXiv:
- arXiv:2009.01366
- Bibcode:
- 2020arXiv200901366R
- Keywords:
-
- Computer Science - Machine Learning
- E-Print:
- 18 pages, 9 Figures