missForest: Nonparametric missing value imputation using random forest
Abstract
missForest imputes missing values, particularly in mixed-type data. It trains a random forest on the observed values of a data matrix and uses it to predict the missing values. It handles continuous and/or categorical data, including complex interactions and non-linear relations. It yields an out-of-bag (OOB) imputation error estimate without the need for a test set or elaborate cross-validation, and it can be run in parallel to save computation time. missForest has been used, among other things, to impute variable star colors in an All-Sky Automated Survey (ASAS) dataset of variable stars with no NOMAD match.
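The idea of predicting missing entries from a model trained on the observed ones can be illustrated with a minimal Python sketch. This is not the missForest implementation or its API. The names `fit_stump`, `fit_forest`, and `impute` are hypothetical, the "forest" is a bagged ensemble of one-split regression trees standing in for a full random forest, only numeric columns are handled, and the column-mean starting guess and fixed number of passes are assumptions made for brevity. Missing entries are marked with `None`.

```python
import random
from statistics import mean


def fit_stump(X, y):
    """Fit a one-split regression tree by minimising squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = mean(left), mean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:  # no usable split: predict the mean of y
        m = mean(y)
        return lambda row: m
    _, j, t, ml, mr = best
    return lambda row: ml if row[j] <= t else mr


def fit_forest(X, y, n_trees, rng):
    """Bag stumps on bootstrap samples (a simplified stand-in for a random forest)."""
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: mean(tree(row) for tree in trees)


def impute(data, n_iter=3, n_trees=25, seed=0):
    """Fill None entries column by column from models trained on observed rows."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    missing = [[v is None for v in row] for row in data]
    filled = [row[:] for row in data]
    # Assumed starting guess: the mean of each column's observed values.
    for j in range(p):
        col_mean = mean(data[i][j] for i in range(n) if not missing[i][j])
        for i in range(n):
            if missing[i][j]:
                filled[i][j] = col_mean
    for _ in range(n_iter):  # fixed number of passes (an assumption of this sketch)
        for j in range(p):
            miss_rows = [i for i in range(n) if missing[i][j]]
            if not miss_rows:
                continue
            obs_rows = [i for i in range(n) if not missing[i][j]]

            def others(i, j=j):
                # Predictors: current values of every column except the target.
                return [filled[i][k] for k in range(p) if k != j]

            model = fit_forest([others(i) for i in obs_rows],
                               [filled[i][j] for i in obs_rows], n_trees, rng)
            for i in miss_rows:
                filled[i][j] = model(others(i))
    return filled


demo = [[1, 10], [2, 10], [3, 10], [4, 10],
        [6, 20], [7, 20], [8, 20], [9, 20],
        [2.5, None], [7.5, None]]
print(impute(demo)[-2:])
```

Observed values are never overwritten; only the entries that were `None` in the input are re-predicted on each pass. The sketch omits the OOB error estimate, categorical variables, and parallel execution described in the abstract.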
- Publication: Astrophysics Source Code Library
- Pub Date: May 2015
- Bibcode: 2015ascl.soft05011S
- Keywords: Software