Efficient Reinforcement Learning for Global Decision Making in the Presence of Local Agents at Scale
Abstract
We study reinforcement learning for global decision-making in the presence of local agents, where the global decision-maker makes decisions affecting all local agents and the objective is to learn a policy that maximizes the joint rewards of all the agents. Such problems arise in many applications, e.g., demand response, EV charging, and queueing. In this setting, scalability has been a long-standing challenge because the size of the state space can be exponential in the number of agents. This work proposes the \texttt{SUBSAMPLE-Q} algorithm, in which the global agent subsamples $k \leq n$ local agents to compute a policy in time that is polynomial in $k$. We show that the learned policy converges to the optimal policy at a rate of $\tilde{O}(1/\sqrt{k} + \epsilon_{k,m})$ as the number of subsampled agents $k$ increases, where $\epsilon_{k,m}$ is the Bellman noise. Finally, we validate the theory through numerical simulations in a demand-response setting and a queueing setting.
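The abstract only summarizes the method, so as an illustration of the core subsampling idea, here is a minimal, hypothetical Python sketch of a tabular Q-learner whose global agent acts on a random subsample of $k$ of the $n$ local agents. The class and method names (`SubsampleQ`, `act`, `update`) and the exchangeability-based state keying are assumptions made for illustration, not the authors' reference implementation; the actual algorithm and its $\tilde{O}(1/\sqrt{k} + \epsilon_{k,m})$ guarantee are developed in the paper.

```python
# Hypothetical sketch of the subsampling idea behind SUBSAMPLE-Q, in a
# simple tabular setting. Not the authors' implementation.
import random
from collections import defaultdict


class SubsampleQ:
    def __init__(self, k, actions, alpha=0.1, gamma=0.99, eps=0.1):
        self.k = k                    # number of subsampled local agents
        self.actions = actions        # global agent's action set
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.Q = defaultdict(float)   # Q over (global state, k local states, action)

    def _key(self, s_global, local_states, a):
        # Treat local agents as exchangeable: sorting the k local states keys
        # the table by an unordered multiset, keeping it polynomial in k
        # rather than exponential in n.
        return (s_global, tuple(sorted(local_states)), a)

    def subsample(self, all_local_states):
        # Draw k of the n local agents' states uniformly at random.
        return random.sample(all_local_states, self.k)

    def act(self, s_global, all_local_states):
        # Epsilon-greedy action based only on the subsampled subsystem.
        sub = self.subsample(all_local_states)
        if random.random() < self.eps:
            return random.choice(self.actions), sub
        best = max(self.actions, key=lambda a: self.Q[self._key(s_global, sub, a)])
        return best, sub

    def update(self, s_global, sub, a, reward, s_global2, sub2):
        # Standard Q-learning step applied to the k-agent subsystem.
        target = reward + self.gamma * max(
            self.Q[self._key(s_global2, sub2, b)] for b in self.actions
        )
        key = self._key(s_global, sub, a)
        self.Q[key] += self.alpha * (target - self.Q[key])
```

The key design point the sketch tries to convey is that learning and acting depend only on the subsampled $k$-agent state, so the table size scales with $k$ rather than with the full number of agents $n$.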
- Publication: arXiv e-prints
- Pub Date: February 2024
- DOI: 10.48550/arXiv.2403.00222
- arXiv: arXiv:2403.00222
- Bibcode: 2024arXiv240300222A
- Keywords: Computer Science - Machine Learning; Computer Science - Multiagent Systems; I.2.6
- E-Print: 34 pages, 6 figures