Fair Distributed Machine Learning with Imbalanced Data as a Stackelberg Evolutionary Game
Abstract
Decentralised learning enables the training of deep learning algorithms without centralising data sets, resulting in benefits such as improved data privacy, operational efficiency and the fostering of data ownership policies. However, significant data imbalances pose a challenge in this framework. Participants with smaller datasets in distributed learning environments often achieve poorer results than participants with larger datasets. Data imbalances are particularly pronounced in medical fields and are caused by different patient populations, technological inequalities and divergent data collection practices. In this paper, we consider distributed learning as an Stackelberg evolutionary game. We present two algorithms for setting the weights of each node's contribution to the global model in each training round: the Deterministic Stackelberg Weighting Model (DSWM) and the Adaptive Stackelberg Weighting Model (ASWM). We use three medical datasets to highlight the impact of dynamic weighting on underrepresented nodes in distributed learning. Our results show that the ASWM significantly favours underrepresented nodes by improving their performance by 2.713% in AUC. Meanwhile, nodes with larger datasets experience only a modest average performance decrease of 0.441%.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2024
- arXiv:
- arXiv:2412.16079
- Bibcode:
- 2024arXiv241216079N
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Computer Science and Game Theory;
- Computer Science - Neural and Evolutionary Computing