Are There Exceptions to Goodhart's Law? On the Moral Justification of Fairness-Aware Machine Learning
Abstract
Fairness-aware machine learning (fair-ml) techniques are algorithmic interventions designed to ensure that individuals who are affected by the predictions of a machine learning model are treated fairly. The problem is often posed as an optimization problem, where the objective is to achieve high predictive performance under a quantitative fairness constraint. However, any attempt to design a fair-ml algorithm must assume a world where Goodhart's law has an exception: when a fairness measure becomes an optimization constraint, it does not cease to be a good measure. In this paper, we argue that fairness measures are particularly sensitive to Goodhart's law. Our main contributions are as follows. First, we present a framework for moral reasoning about the justification of fairness metrics. In contrast to existing work, our framework incorporates the belief that whether a distribution of outcomes is fair depends not only on the cause of inequalities but also on what moral claims decision subjects have to receive a particular benefit or avoid a burden. We use the framework to distil moral and empirical assumptions under which particular fairness metrics correspond to a fair distribution of outcomes. Second, we explore the extent to which employing fairness metrics as a constraint in a fair-ml algorithm is morally justifiable, exemplified by the fair-ml algorithm introduced by Hardt et al. (2016). We illustrate that enforcing a fairness metric through a fair-ml algorithm often does not result in the fair distribution of outcomes that motivated its use and can even harm the individuals the intervention was intended to protect.
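To make the notion of a "quantitative fairness constraint" concrete, the following is a minimal sketch of how one such measure could be computed. It assumes the measure is equalized odds, the criterion associated with Hardt et al. (2016); the abstract does not name the metric, so this choice is an assumption. The function names `group_rates` and `equalized_odds_gaps` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Hashable, Sequence, Tuple


def group_rates(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    group: Sequence[Hashable],
    g: Hashable,
) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) for members of group g."""
    tp = fp = fn = tn = 0
    for yt, yp, a in zip(y_true, y_pred, group):
        if a != g:
            continue
        if yt == 1 and yp == 1:
            tp += 1
        elif yt == 1 and yp == 0:
            fn += 1
        elif yt == 0 and yp == 1:
            fp += 1
        else:
            tn += 1
    # A rate is undefined (NaN) if the group has no positives or no negatives.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr


def equalized_odds_gaps(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    group: Sequence[Hashable],
    g0: Hashable,
    g1: Hashable,
) -> Tuple[float, float]:
    """Return the absolute TPR gap and FPR gap between groups g0 and g1.

    Equalized odds holds exactly when both gaps are zero.
    """
    tpr0, fpr0 = group_rates(y_true, y_pred, group, g0)
    tpr1, fpr1 = group_rates(y_true, y_pred, group, g1)
    return abs(tpr0 - tpr1), abs(fpr0 - fpr1)


if __name__ == "__main__":
    # Toy data (hypothetical): four individuals in each of two groups.
    y_true = [1, 1, 0, 0, 1, 1, 0, 0]
    y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
    group = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(equalized_odds_gaps(y_true, y_pred, group, "a", "b"))  # (0.5, 0.5)
```

A fair-ml algorithm in this framing would search for a predictor that drives both gaps toward zero while keeping predictive performance high. Note that the sketch only measures the gaps; it says nothing about whether closing them yields a morally fair distribution of outcomes, which is the question the paper raises.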
- Publication: arXiv e-prints
- Pub Date: February 2022
- arXiv: arXiv:2202.08536
- Bibcode: 2022arXiv220208536W
- Keywords: Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Computer Science - Computers and Society