Random Bits Regression: a Strong General Predictor for Big Data
Abstract
To improve the accuracy and speed of regression and classification, we present a data-based prediction method, Random Bits Regression (RBR). The method first generates a large number of random binary intermediate (derived) features from the original input matrix, and then performs regularized linear or logistic regression on those derived features to predict the outcome. Benchmark analyses on a simulated dataset, UCI Machine Learning Repository datasets, and a GWAS dataset showed that RBR outperforms other popular methods in accuracy and robustness. RBR (available at https://sourceforge.net/projects/rbr/) is very fast and requires a reasonable amount of memory, and therefore provides a strong, robust, and fast predictor for the big data era.
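The two-stage idea described in the abstract (random binary features, then a regularized linear fit) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say how the random bits are constructed, so the construction here is an assumption: each bit thresholds a random linear combination of a small random subset of input columns, with the threshold taken from a randomly chosen training row. The function names (`fit_rbr`, `random_bits`, `predict_rbr`), the parameters `n_bits`, `subset_size`, and `ridge`, and the use of closed-form ridge regression are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_rbr(X, y, n_bits=500, subset_size=3, ridge=1.0, seed=0):
    """Fit a toy random-bits model (illustrative assumptions, see lead-in)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    k = min(subset_size, n_features)

    # Assumption: each bit looks at k randomly chosen input columns ...
    idx = np.stack(
        [rng.choice(n_features, size=k, replace=False) for _ in range(n_bits)]
    )
    # ... combines them with random Gaussian weights ...
    weights = rng.standard_normal((n_bits, k))
    proj = np.einsum("nbk,bk->nb", X[:, idx], weights)

    # ... and thresholds at the projection of a random training row.
    rows = rng.integers(0, n_samples, size=n_bits)
    thresholds = proj[rows, np.arange(n_bits)]
    Z = (proj > thresholds).astype(float)

    # Regularized linear regression on the bits (ridge, intercept unpenalized).
    Z1 = np.hstack([np.ones((n_samples, 1)), Z])
    penalty = ridge * np.eye(n_bits + 1)
    penalty[0, 0] = 0.0
    beta = np.linalg.solve(Z1.T @ Z1 + penalty, Z1.T @ y)

    return {
        "idx": idx,
        "weights": weights,
        "thresholds": thresholds,
        "intercept": beta[0],
        "coef": beta[1:],
    }


def random_bits(X, model):
    """Map inputs to the binary derived features stored in the model."""
    proj = np.einsum("nbk,bk->nb", X[:, model["idx"]], model["weights"])
    return (proj > model["thresholds"]).astype(float)


def predict_rbr(model, X):
    """Predict a continuous outcome from the derived binary features."""
    return model["intercept"] + random_bits(X, model) @ model["coef"]
```

In this sketch, calling `fit_rbr(X, y)` on a numeric matrix `X` and a continuous outcome `y`, followed by `predict_rbr(model, X_new)`, gives predictions for new rows. A classification variant would replace the ridge solve with regularized logistic regression on the same bits.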
- Publication: arXiv e-prints
- Pub Date: January 2015
- DOI: 10.48550/arXiv.1501.02990
- arXiv: arXiv:1501.02990
- Bibcode: 2015arXiv150102990W
- Keywords: Statistics - Machine Learning; Computer Science - Machine Learning
- E-Print: 20 pages, 1 figure, 2 tables, research article