High Dimensional Model Representation as a Glass Box in Supervised Machine Learning
Abstract
Prediction and explanation are key objects in supervised machine learning, where predictive models are known as black boxes and explanatory models are known as glass boxes. Explanation provides the necessary and sufficient information to interpret the model output in terms of the model input. It includes assessments of model output dependence on important input variables and measures of input variable importance to model output. High dimensional model representation (HDMR), also known as the generalized functional ANOVA expansion, provides useful insight into the input-output behavior of supervised machine learning models. This article gives applications of HDMR in supervised machine learning. The first application is characterizing information leakage in "big-data" settings. The second application is reduced-order representation of elementary symmetric polynomials. The third application is analysis of variance with correlated variables. The last application is estimation of HDMR from kernel machine and decision tree black box representations. These results suggest that HDMR has broad utility within machine learning as a glass box representation.
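To make the idea of a functional ANOVA expansion concrete, the following is a minimal sketch, not the article's method. It assumes two independent inputs uniformly distributed on [0, 1], which is the classical special case rather than the generalized (correlated-input) setting the abstract refers to. The function name `hdmr_2d`, the midpoint grid, and the test function are illustrative choices. The sketch splits a function into a constant, two one-variable component functions, and an interaction term, then reports each term's share of the output variance as a measure of input variable importance.

```python
import numpy as np

def hdmr_2d(f, n=200):
    """Classical functional ANOVA (HDMR) of f on [0, 1]^2.

    Assumes independent uniform inputs; integrals are approximated
    with a midpoint grid of n points per axis (illustrative choice).
    """
    x = (np.arange(n) + 0.5) / n                  # midpoint nodes on [0, 1]
    X1, X2 = np.meshgrid(x, x, indexing="ij")     # axis 0 is x1, axis 1 is x2
    F = f(X1, X2)

    f0 = F.mean()                                 # constant term (overall mean)
    f1 = F.mean(axis=1) - f0                      # first-order term in x1
    f2 = F.mean(axis=0) - f0                      # first-order term in x2
    f12 = F - f0 - f1[:, None] - f2[None, :]      # second-order interaction

    total_var = F.var()
    # Variance share of each component (sensitivity indices).
    S = {
        "1": f1.var() / total_var,
        "2": f2.var() / total_var,
        "12": f12.var() / total_var,
    }
    return f0, f1, f2, f12, S

# Hypothetical test function with a main effect per input and an interaction.
f0, f1, f2, f12, S = hdmr_2d(lambda a, b: a + 2.0 * b + a * b)
print(f0, S)
```

On this grid the component functions are mutually orthogonal by construction, so the three variance shares sum to one. With correlated inputs that orthogonality no longer holds in this simple form, which is the situation the generalized expansion is meant to handle.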
- Publication: arXiv e-prints
- Pub Date: July 2018
- arXiv: arXiv:1807.10320
- Bibcode: 2018arXiv180710320D
- Keywords: Statistics - Methodology; Statistics - Machine Learning
- E-Print: 54 pages, 23 figures, 5 tables