Detecting Potential Local Adversarial Examples for Human-Interpretable Defense

doi:10.48550/arXiv.1809.02397

Machine learning models are increasingly used in industry to make decisions such as credit insurance approval. Applicants may be tempted to manipulate specific variables, such as age or salary, to improve their chances of approval. In this ongoing work, we discuss the problem of detecting potential local adversarial examples on classical tabular data and offer a first proposal: provide a human expert with the features that are locally critical to the classifier's decision, so that the expert can check the supplied information and prevent fraud.

Publication: arXiv e-prints

Pub Date: September 2018

DOI: 10.48550/arXiv.1809.02397

arXiv: arXiv:1809.02397

Bibcode: 2018arXiv180902397R

Keywords: Statistics - Machine Learning; Computer Science - Cryptography and Security; Computer Science - Machine Learning

E-Print: Presented at the 2018 ECML/PKDD Workshop on Recent Advances in Adversarial Machine Learning (Nemesis 2018), Dublin, Ireland
