Online Learning of Weakly Coupled MDP Policies for Load Balancing and Auto Scaling

doi:10.48550/arXiv.2406.14141

Load balancing and auto scaling are at the core of scalable, contemporary systems, addressing dynamic resource allocation and service rate adjustments in response to workload changes. This paper introduces a novel model and algorithms for tuning load balancers coupled with auto scalers, considering bursty traffic arriving at finite queues. We begin by formulating the problem as a weakly coupled Markov decision process (MDP), solvable via a linear program (LP). However, because the number of control variables in this LP grows combinatorially, we introduce a more tractable relaxed LP formulation, and extend it to tackle the problem of online parameter learning and policy optimization using a two-timescale algorithm based on the LP Lagrangian.
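
To give a rough sense of why the joint LP becomes intractable, the sketch below counts LP variables under simplifying assumptions that are not taken from the paper: each queue's state is only its length (0 to B), each queue has its own set of A actions, and an occupation-measure LP uses one variable per state-action pair. The function names and the values of N, B, and A are hypothetical. The paper's actual state also captures bursty arrivals, so its counts would differ.

```python
# Illustrative count only. The model (queue length as the sole state, a
# per-queue action set) and all numbers are assumptions, not the paper's.


def joint_lp_variables(num_queues: int, buffer_size: int, actions_per_queue: int) -> int:
    """Variables in an occupation-measure LP over the joint (product) MDP.

    One variable per joint state-action pair: (S * A) ** N.
    """
    states_per_queue = buffer_size + 1  # queue length 0..buffer_size
    return (states_per_queue * actions_per_queue) ** num_queues


def relaxed_lp_variables(num_queues: int, buffer_size: int, actions_per_queue: int) -> int:
    """Variables when each queue keeps its own occupation measure: N * S * A."""
    states_per_queue = buffer_size + 1
    return num_queues * states_per_queue * actions_per_queue


if __name__ == "__main__":
    # Hypothetical sizes: buffer of 10 jobs, 3 actions per queue.
    for n in (2, 4, 8):
        print(n, joint_lp_variables(n, 10, 3), relaxed_lp_variables(n, 10, 3))
```

Under these assumptions the joint count grows exponentially in the number of queues, while the per-queue count grows linearly, which is the kind of gap a relaxed formulation is meant to exploit.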

Publication:

arXiv e-prints

Pub Date:

June 2024

DOI:

10.48550/arXiv.2406.14141

arXiv:

arXiv:2406.14141

Bibcode:

2024arXiv240614141E

Keywords:

Electrical Engineering and Systems Science - Systems and Control;
Computer Science - Artificial Intelligence;
Computer Science - Networking and Internet Architecture

NASA/ADS
