Improved Strongly Polynomial Algorithms for Deterministic MDPs, 2VPI Feasibility, and Discounted All-Pairs Shortest Paths
Abstract
We revisit the problem of finding optimal strategies for deterministic Markov Decision Processes (DMDPs), and a closely related problem of testing feasibility of systems of $m$ linear inequalities on $n$ real variables with at most two variables per inequality (2VPI). We give a randomized trade-off algorithm solving both problems and running in $\tilde{O}(nmh+(n/h)^3)$ time using $\tilde{O}(n^2/h+m)$ space for any parameter $h\in [1,n]$. In particular, using subquadratic space we get $\tilde{O}(nm+n^{3/2}m^{3/4})$ running time, which improves by a polynomial factor upon all the known upper bounds for non-dense instances with $m=O(n^{2-\epsilon})$. Moreover, using linear space we match the randomized $\tilde{O}(nm+n^3)$ time bound of Cohen and Megiddo [SICOMP'94] that required $\tilde{\Theta}(n^2+m)$ space. Additionally, we show a new algorithm for the Discounted All-Pairs Shortest Paths problem, introduced by Madani et al. [TALG'10], that extends the DMDPs with optional end vertices. For the case of uniform discount factors, we give a deterministic algorithm running in $\tilde{O}(n^{3/2}m^{3/4})$ time, which improves significantly upon the randomized bound $\tilde{O}(n^2\sqrt{m})$ of Madani et al.
- Publication:
-
arXiv e-prints
- Pub Date:
- October 2021
- DOI:
- arXiv:
- arXiv:2110.15070
- Bibcode:
- 2021arXiv211015070K
- Keywords:
-
- Computer Science - Data Structures and Algorithms
- E-Print:
- Full version of a SODA'22 paper