FlipTest: Fairness Testing via Optimal Transport
Abstract
We present FlipTest, a black-box technique for uncovering discrimination in classifiers. FlipTest is motivated by the intuitive question: had an individual been of a different protected status, would the model have treated them differently? Rather than relying on causal information to answer this question, FlipTest leverages optimal transport to match individuals in different protected groups, creating similar pairs of in-distribution samples. We show how to use these instances to detect discrimination by constructing a "flipset": the set of individuals whose classifier output changes post-translation, which corresponds to the set of people who may be harmed because of their group membership. To shed light on why the model treats a given subgroup differently, FlipTest produces a "transparency report": a ranking of features that are most associated with the model's behavior on the flipset. Evaluating the approach on three case studies, we show that this provides a computationally inexpensive way to identify subgroups that may be harmed by model discrimination, including in cases where the model satisfies group fairness criteria.
- Publication:
-
arXiv e-prints
- Pub Date:
- June 2019
- DOI:
- 10.48550/arXiv.1906.09218
- arXiv:
- arXiv:1906.09218
- Bibcode:
- 2019arXiv190609218B
- Keywords:
-
- Computer Science - Machine Learning;
- Statistics - Machine Learning
- E-Print:
- Accepted to ACM FAT* 2020