Concept Tree: High-Level Representation of Variables for More Interpretable Surrogate Decision Trees

doi:10.48550/arXiv.1906.01297

Interpretable surrogates of black-box predictors trained on high-dimensional tabular datasets can struggle to generate comprehensible explanations in the presence of correlated variables. To address this issue, we propose a model-agnostic interpretable surrogate that provides global and local explanations of black-box classifiers. We introduce the idea of concepts: intuitive groupings of variables that are either defined by a domain expert or discovered automatically using correlation coefficients. Concepts are embedded in a surrogate decision tree to enhance its comprehensibility. Initial experiments on FRED-MD, a macroeconomic database with 134 variables, show improved human interpretability while the accuracy and fidelity of the surrogate model are preserved.
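The abstract does not spell out how concepts are discovered from correlation coefficients, so the following is only a minimal sketch under stated assumptions: variables are grouped when their absolute Pearson correlation reaches a threshold, and groups are taken as connected components of that relation. The function names, the 0.9 threshold, and the toy data are all hypothetical and are not taken from the paper or from FRED-MD.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def group_into_concepts(columns, threshold=0.9):
    """Group variables whose absolute pairwise correlation reaches `threshold`.

    `columns` maps a variable name to its list of values. Groups are the
    connected components of the "strongly correlated" graph, built with a
    small union-find. This grouping rule is an assumption for illustration.
    """
    parent = {name: name for name in columns}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in combinations(columns, 2):
        if abs(pearson(columns[a], columns[b])) >= threshold:
            parent[find(a)] = find(b)

    concepts = {}
    for name in columns:
        concepts.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in concepts.values())


# Hypothetical toy data: two price-like series that move together,
# and one unrelated series.
toy = {
    "cpi": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ppi": [2.1, 4.0, 6.2, 8.1, 9.9],
    "unemployment": [5.0, 3.0, 6.0, 2.0, 4.0],
}
concepts = group_into_concepts(toy)
print(concepts)
```

In a surrogate tree built on such groups, each split could then be reported at the level of a concept (for example, a "prices" group) rather than a single raw variable, which is the kind of readability gain the abstract describes.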

Publication:

arXiv e-prints

Pub Date:

June 2019

DOI:

10.48550/arXiv.1906.01297

arXiv:

arXiv:1906.01297

Bibcode:

2019arXiv190601297R

Keywords:

Statistics - Machine Learning;
Computer Science - Machine Learning

E-Print:

Presented at the 2019 ICML Workshop on Human in the Loop Learning (HILL 2019), Long Beach, USA
