A Novel Pearson Correlation-Based Merging Algorithm for Robust Distributed Machine Learning with Heterogeneous Data
Abstract
Federated learning faces significant challenges in scenarios with heterogeneous data distributions and adverse network conditions, such as delays, packet loss, and data poisoning attacks. This paper proposes a novel method based on the SCAFFOLD algorithm to improve the quality of local updates and enhance the robustness of the global model. The key idea is to form intermediary nodes by merging local models with high similarity, using the Pearson correlation coefficient as a similarity measure. The proposed merging algorithm reduces the number of local nodes while maintaining the accuracy of the global model, effectively addressing communication overhead and bandwidth consumption. Experimental results on the MNIST dataset under simulated federated learning scenarios demonstrate the method's effectiveness. After 10 rounds of training using a CNN model, the proposed approach achieved accuracies of 0.82, 0.73, and 0.66 under normal conditions, packet loss and data poisoning attacks, respectively, outperforming the baseline SCAFFOLD algorithm. These results highlight the potential of the proposed method to improve efficiency and resilience in federated learning systems.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2025
- arXiv:
- arXiv:2501.11112
- Bibcode:
- 2025arXiv250111112G
- Keywords:
-
- Computer Science - Machine Learning