Editing a classifier by rewriting its prediction rules

doi:10.48550/arXiv.2112.01008

We present a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules. Our approach requires virtually no additional data collection and can be applied to a variety of settings, including adapting a model to new environments and modifying it to ignore spurious features. Our code is available at https://github.com/MadryLab/EditingClassifiers.

Publication:

arXiv e-prints

Pub Date:

December 2021

DOI:

10.48550/arXiv.2112.01008

arXiv:

arXiv:2112.01008

Bibcode:

2021arXiv211201008S

Keywords:

Computer Science - Machine Learning;
Computer Science - Computer Vision and Pattern Recognition
