Lifting Interpretability-Performance Trade-off via Automated Feature Engineering
Complex black-box predictive models may achieve high performance, but their lack of interpretability causes problems such as distrust, instability, and sensitivity to concept drift. On the other hand, achieving satisfactory accuracy with interpretable models usually requires time-consuming work on feature engineering. Can we train interpretable and accurate models without tedious feature engineering? We propose a method that uses elastic black-boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models. The new models are trained on features newly engineered with the help of a surrogate model. We support the analysis with a large-scale benchmark on several tabular data sets from the OpenML database. There are two main results: 1) extracting information from complex models may improve the performance of linear models; 2) our findings question the common belief that complex machine learning models outperform linear models.
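A minimal sketch of the surrogate-based idea, assuming scikit-learn and one common instantiation: the abstract does not specify the exact feature transformation, so here a tree-ensemble surrogate supplies engineered features (one-hot-encoded leaf memberships) on which an interpretable linear model is then trained. The model choices, data set, and hyperparameters are illustrative assumptions, not the paper's benchmark setup.

```python
# Illustrative sketch only: surrogate-guided feature engineering.
# A gradient-boosting "black-box" surrogate produces new features
# (leaf indices per tree), and a logistic regression "glass-box"
# model is fit on those features instead of the raw inputs.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic tabular data standing in for an OpenML data set.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 1) Fit an elastic black-box surrogate.
surrogate = GradientBoostingClassifier(n_estimators=50, random_state=0)
surrogate.fit(X_tr, y_tr)

# 2) Extract engineered features: the leaf each sample reaches in each tree.
enc = OneHotEncoder(handle_unknown="ignore")
Z_tr = enc.fit_transform(surrogate.apply(X_tr)[:, :, 0])
Z_te = enc.transform(surrogate.apply(X_te)[:, :, 0])

# 3) Train an interpretable glass-box model on the new features.
glass_box = LogisticRegression(max_iter=1000)
glass_box.fit(Z_tr, y_tr)
print(round(glass_box.score(Z_te, y_te), 3))
```

The glass-box model remains a linear model, so its coefficients stay inspectable, while the surrogate-derived features carry the nonlinear structure the black-box discovered.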
- Pub Date: February 2020
- Computer Science - Machine Learning;
- Statistics - Machine Learning
- 12 pages, 5 figures, 4 tables. arXiv admin note: substantial text overlap with arXiv:1902.11035