Markov decision processes: on the convergence of the Monte-Carlo first visit algorithm

We consider the Monte-Carlo first visit algorithm, whose goal is to find the optimal control in a Markov decision process with a finite state space and finitely many possible actions. We show that it converges when the discount factor is smaller than $1/2$.
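For orientation, here is a minimal sketch of what a first-visit Monte-Carlo control loop can look like. This record does not spell out the variant the paper analyses, so everything below is an assumption for illustration: the textbook scheme with exploring starts, a hypothetical two-state, two-action deterministic MDP (`step`), a discount factor of 0.4 (below $1/2$), and episodes truncated at a fixed `HORIZON`. None of it is the authors' construction.

```python
import random

GAMMA = 0.4      # discount factor, chosen below 1/2 (illustrative value)
HORIZON = 50     # truncation length (assumption): GAMMA**50 is negligible
STATES = [0, 1]
ACTIONS = [0, 1]  # 0 = stay, 1 = switch to the other state


def step(state, action):
    """Hypothetical toy dynamics: reward 1 whenever the next state is 1."""
    next_state = state if action == 0 else 1 - state
    return next_state, float(next_state == 1)


def first_visit_mc_control(n_episodes=2000, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    visits = {(s, a): 0 for s in STATES for a in ACTIONS}
    policy = {s: 0 for s in STATES}

    for _ in range(n_episodes):
        # Exploring start: uniformly random initial state-action pair,
        # then follow the current greedy policy.
        state, action = rng.choice(STATES), rng.choice(ACTIONS)
        trajectory = []
        for _ in range(HORIZON):
            next_state, reward = step(state, action)
            trajectory.append((state, action, reward))
            state, action = next_state, policy[next_state]

        # Backward pass: discounted return observed from each time step.
        returns = [0.0] * len(trajectory)
        g = 0.0
        for t in reversed(range(len(trajectory))):
            g = trajectory[t][2] + GAMMA * g
            returns[t] = g

        # First-visit update: only the earliest occurrence of each
        # (state, action) pair in the episode enters the running average.
        seen = set()
        for t, (s, a, _) in enumerate(trajectory):
            if (s, a) in seen:
                continue
            seen.add((s, a))
            visits[(s, a)] += 1
            q[(s, a)] += (returns[t] - q[(s, a)]) / visits[(s, a)]

        # Policy improvement: act greedily with respect to the estimates.
        for s in STATES:
            policy[s] = max(ACTIONS, key=lambda a: q[(s, a)])

    return q, policy


if __name__ == "__main__":
    q, policy = first_visit_mc_control()
    print(policy, q)
```

In this toy problem the optimal control is to move to state 1 and stay there, so the learned policy can be checked by hand. The example says nothing about the general convergence question the paper addresses.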

Publication: arXiv e-prints

Pub Date: January 2025

arXiv: arXiv:2501.08800

Bibcode: 2025arXiv250108800D

Keywords: Mathematics - Probability; Mathematics - Optimization and Control
