Improved Communication Efficiency in Federated Natural Policy Gradient via ADMM-based Gradient Updates
Abstract
Federated reinforcement learning (FedRL) enables agents to collaboratively train a global policy without sharing their individual data. However, high communication overhead remains a critical bottleneck, particularly for natural policy gradient (NPG) methods, which are second-order. To address this issue, we propose the FedNPG-ADMM framework, which leverages the alternating direction method of multipliers (ADMM) to approximate global NPG directions efficiently. We theoretically demonstrate that using ADMM-based gradient updates reduces communication complexity from ${O}({d^{2}})$ to ${O}({d})$ at each iteration, where $d$ is the number of model parameters. Furthermore, we show that achieving an $\epsilon$-error stationary convergence requires ${O}(\frac{1}{(1-\gamma)^{2}{\epsilon}})$ iterations for discount factor $\gamma$, demonstrating that FedNPG-ADMM maintains the same convergence rate as the standard FedNPG. Through evaluation of the proposed algorithms in MuJoCo environments, we demonstrate that FedNPG-ADMM maintains the reward performance of standard FedNPG, and that its convergence rate improves when the number of federated agents increases.
- Publication:
-
arXiv e-prints
- Pub Date:
- October 2023
- DOI:
- 10.48550/arXiv.2310.19807
- arXiv:
- arXiv:2310.19807
- Bibcode:
- 2023arXiv231019807L
- Keywords:
-
- Computer Science - Machine Learning;
- Mathematics - Optimization and Control;
- I.2.6
- E-Print:
- Accepted at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)