Learning While Scheduling in Multi-Server Systems with Unknown Statistics: MaxWeight with Discounted UCB
Abstract
Multi-server queueing systems are widely used models for job scheduling in machine learning, wireless networks, crowdsourcing, and healthcare systems. This paper considers a multi-server system with multiple servers and multiple types of jobs, where different job types require different amounts of processing time at different servers. The goal is to schedule jobs on servers without knowing the statistics of the processing times. To fully utilize the processing power of the servers, it is known that one has to at least learn the service rates of different job types on different servers. Prior works on this topic decouple the learning and scheduling phases which leads to either excessive exploration or extremely large job delays. We propose a new algorithm, which combines the MaxWeight scheduling policy with discounted upper confidence bound (UCB), to simultaneously learn the statistics and schedule jobs to servers. We prove that under our algorithm the asymptotic average queue length is bounded by one divided by the traffic slackness, which is order-wise optimal. We also obtain an exponentially decaying probability tail bound for any-time queue length. These results hold for both stationary and nonstationary service rates. Simulations confirm that the delay performance of our algorithm is several orders of magnitude better than previously proposed algorithms.
- Publication:
-
arXiv e-prints
- Pub Date:
- September 2022
- DOI:
- 10.48550/arXiv.2209.01126
- arXiv:
- arXiv:2209.01126
- Bibcode:
- 2022arXiv220901126Y
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Networking and Internet Architecture