SafeML: Safety Monitoring of Machine Learning Classifiers through Statistical Difference Measure

doi:10.48550/arXiv.2005.13166

Ensuring the safety and explainability of machine learning (ML) is increasingly relevant as data-driven applications enter safety-critical domains. These domains are traditionally committed to high safety standards, which cannot be satisfied by testing alone when the system is an otherwise inaccessible black box. The interaction between safety and security is a particular challenge, because security violations can compromise safety. This paper addresses both safety and security within a single protection concept applicable during the operation of ML systems: active monitoring of the behaviour and operational context of the data-driven system, based on distance measures of the Empirical Cumulative Distribution Function (ECDF). We investigate abstract datasets (XOR, Spiral, Circle) and a current security-specific dataset for intrusion detection in simulated network traffic (CICIDS2017), using distributional shift detection measures including the Kolmogorov-Smirnov, Kuiper, Anderson-Darling, Wasserstein, and mixed Wasserstein-Anderson-Darling measures. Our preliminary findings indicate that the approach can provide a basis for detecting whether the application context of an ML component is valid from a safety-security perspective. Our preliminary code and results are available at https://github.com/ISorokos/SafeML.
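
To make the idea of ECDF-based distance measures concrete, the following is a minimal sketch of three of the measures named above (Kolmogorov-Smirnov, Kuiper, and Wasserstein) for one-dimensional samples. It is not the authors' implementation, which lives in the linked repository; the function names, the toy data, and the pure-Python approach are assumptions made for illustration, and the Anderson-Darling variants are omitted.

```python
from bisect import bisect_right


def ecdf_pair(a, b):
    """Evaluate the ECDFs of samples a and b on their pooled, sorted values."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    grid = sorted(set(a_sorted) | set(b_sorted))
    # ECDF value at x is the fraction of sample points <= x.
    fa = [bisect_right(a_sorted, x) / len(a_sorted) for x in grid]
    fb = [bisect_right(b_sorted, x) / len(b_sorted) for x in grid]
    return grid, fa, fb


def ks_distance(a, b):
    """Kolmogorov-Smirnov distance: largest absolute gap between the ECDFs."""
    _, fa, fb = ecdf_pair(a, b)
    return max(abs(p - q) for p, q in zip(fa, fb))


def kuiper_distance(a, b):
    """Kuiper distance: largest positive gap plus largest negative gap."""
    _, fa, fb = ecdf_pair(a, b)
    d_plus = max(p - q for p, q in zip(fa, fb))
    d_minus = max(q - p for p, q in zip(fa, fb))
    return d_plus + d_minus


def wasserstein_distance_1d(a, b):
    """1-D Wasserstein-1 distance: area between the two step-function ECDFs."""
    grid, fa, fb = ecdf_pair(a, b)
    return sum(
        abs(fa[i] - fb[i]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )


# Hypothetical toy data: a feature seen at training time, and the same
# feature at run time after a shift of +2.
training = [0.0, 1.0, 2.0, 3.0]
runtime = [x + 2.0 for x in training]

print(ks_distance(training, runtime))              # 0.5
print(kuiper_distance(training, runtime))          # 0.5
print(wasserstein_distance_1d(training, runtime))  # 2.0
```

In a monitoring setting, one plausible use is to compare such a distance between training-time and run-time feature samples against a threshold chosen by the system designer; how that threshold is set is outside the scope of this sketch.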

Publication: arXiv e-prints
Pub Date: May 2020
DOI: 10.48550/arXiv.2005.13166
arXiv: arXiv:2005.13166
Bibcode: 2020arXiv200513166A
Keywords: Computer Science - Machine Learning; Computer Science - Cryptography and Security; Statistics - Machine Learning

NASA/ADS
