Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning
Abstract
We consider a general asynchronous Stochastic Approximation (SA) scheme featuring a weighted infinity-norm contractive operator, and prove a bound on its finite-time convergence rate on a single trajectory. Additionally, we specialize the result to asynchronous $Q$-learning. The resulting bound matches the sharpest available bound for synchronous $Q$-learning, and improves on previously known bounds for asynchronous $Q$-learning.
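As an illustration of the asynchronous setting described in the abstract, the following is a minimal Python sketch of tabular asynchronous $Q$-learning along a single trajectory, where only the entry for the currently visited state-action pair is updated at each step. The toy two-state MDP, the uniformly random behavior policy, the constant step size `alpha`, and the function name `async_q_learning` are assumptions chosen for illustration. They are not taken from the paper, whose step-size schedule and bound are not reproduced here.

```python
import random


def async_q_learning(P, R, gamma, num_steps, alpha=0.5, seed=0):
    """Tabular asynchronous Q-learning on one trajectory (illustrative sketch).

    P[s][a] is a list of next-state probabilities and R[s][a] is a
    deterministic reward. Only the visited (s, a) entry changes per step.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    s = 0  # arbitrary initial state (assumption)
    for _ in range(num_steps):
        # Uniformly random behavior policy (assumption): every pair keeps
        # being visited along the single trajectory.
        a = rng.randrange(n_actions)
        s_next = rng.choices(range(n_states), weights=P[s][a])[0]
        # Sampled Bellman optimality target for the visited pair only.
        target = R[s][a] + gamma * max(Q[s_next])
        # Constant step size is an illustrative choice, not the paper's.
        Q[s][a] += alpha * (target - Q[s][a])
        s = s_next
    return Q


# Hypothetical toy MDP: action 0 stays in place, action 1 switches state.
# The only nonzero reward is for staying in state 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1
]
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]

Q = async_q_learning(P, R, gamma=0.5, num_steps=5000)
```

Because this toy MDP is deterministic, the iterates approach the fixed point of the Bellman optimality operator, which for this example works out by hand to $Q^*(0,\cdot) = (0.5, 1)$ and $Q^*(1,\cdot) = (2, 0.5)$. With stochastic transitions a constant step size would generally leave residual noise, so a diminishing schedule would be the more natural choice.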
- Publication: arXiv e-prints
- Pub Date: February 2020
- DOI: 10.48550/arXiv.2002.00260
- arXiv: arXiv:2002.00260
- Bibcode: 2020arXiv200200260Q
- Keywords: Mathematics - Optimization and Control; Computer Science - Machine Learning