Enhancing Efficiency in Multidevice Federated Learning through Data Selection
Abstract
Federated learning (FL) in multidevice environments creates new opportunities to learn from a vast and diverse amount of private data. Although personal devices capture valuable data, their memory, computing, connectivity, and battery resources are often limited. Since deep neural networks (DNNs) are the typical machine learning models employed in FL, there is demand for integrating ubiquitous constrained devices into the training process of DNNs. In this paper, we develop an FL framework to incorporate on-device data selection on such constrained devices, which allows partition-based training of a DNN through collaboration between constrained devices and resourceful devices of the same client. Evaluations on five benchmark DNNs and six benchmark datasets across different modalities show that, on average, our framework achieves ~19% higher accuracy and ~58% lower latency compared to the baseline FL without our implemented strategies. We demonstrate the effectiveness of our FL framework when dealing with imbalanced data, client participation heterogeneity, and various mobility patterns. As a benchmark for the community, our code is available at https://github.com/dr-bell/data-centric-federated-learning
- Publication: arXiv e-prints
- Pub Date: November 2022
- DOI: 10.48550/arXiv.2211.04175
- arXiv: arXiv:2211.04175
- Bibcode: 2022arXiv221104175M
- Keywords: Computer Science - Machine Learning
- E-Print: Previous version (v3) was presented at ICLR 2023 Workshop on Machine Learning for IoT: Datasets, Perception, and Understanding