On Convergence Analysis of Policy Iteration Algorithms for Entropy-Regularized Stochastic Control Problems
Abstract
In this paper we investigate the convergence of the Policy Iteration Algorithm (PIA) for a class of general continuous-time entropy-regularized stochastic control problems. In particular, instead of employing sophisticated PDE estimates for the iterative PDEs involved in the PIA (see, e.g., Huang-Wang-Zhou (2023)), we provide a simple proof from scratch of the convergence of the PIA. Our approach builds on probabilistic representation formulae for solutions of PDEs and their derivatives. Moreover, in the infinite horizon model with a large discount factor and in the finite horizon model, similar arguments readily yield an exponential rate of convergence of the PIA. Finally, with some extra effort we show that our approach can also be extended to the case where the diffusion coefficient contains the control, in the one-dimensional setting but without many extra constraints on the coefficients. We believe that these results are new in the literature.
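For orientation, one PIA step in the infinite horizon model can be sketched as below. This is only a sketch under the standard exploratory formulation with uncontrolled diffusion, not a statement of the paper's exact setting: the symbols $b$ (drift), $\sigma$ (diffusion), $r$ (running reward), $\rho$ (discount rate), $\lambda$ (entropy temperature) and $A$ (action space) are notational assumptions that the abstract does not fix.

```latex
% Policy evaluation: given a policy \pi^n, v^n solves a linear elliptic PDE.
\rho\, v^n(x)
  = \tfrac{1}{2}\operatorname{tr}\!\big(\sigma\sigma^{\top}(x)\, D^2 v^n(x)\big)
  + \int_A \Big[ b(x,a)\cdot\nabla v^n(x) + r(x,a)
      - \lambda \ln \pi^n(a \mid x) \Big]\, \pi^n(a \mid x)\, da .

% Policy improvement: the next policy is a Gibbs density built from \nabla v^n.
\pi^{n+1}(a \mid x)
  = \frac{\exp\!\Big( \tfrac{1}{\lambda}\big[\, b(x,a)\cdot\nabla v^n(x) + r(x,a) \,\big] \Big)}
         {\displaystyle\int_A \exp\!\Big( \tfrac{1}{\lambda}\big[\, b(x,a')\cdot\nabla v^n(x) + r(x,a') \,\big] \Big)\, da'} .
```

In this sketch the update depends on $v^n$ only through its gradient, which is consistent with the abstract's emphasis on representation formulae for derivatives of the PDE solutions.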
- Publication: arXiv e-prints
- Pub Date: June 2024
- DOI: 10.48550/arXiv.2406.10959
- arXiv: arXiv:2406.10959
- Bibcode: 2024arXiv240610959M
- Keywords: Mathematics - Optimization and Control; Computer Science - Machine Learning; 93E35; 60H30; 35Q93
- E-Print: In this version, we have added results on convergence and rate of convergence for the diffusion control problem in the scalar case.