Büchi Objectives in Countable MDPs
Abstract
We study countably infinite Markov decision processes with Büchi objectives, which ask to visit a given subset of states infinitely often. A question left open by T.P. Hill in 1979 is whether there always exist $\varepsilon$-optimal Markov strategies, i.e., strategies that base decisions only on the current state and the number of steps taken so far. We provide a negative answer to this question by constructing a nontrivial counterexample. On the other hand, we show that Markov strategies with only 1 bit of extra memory are sufficient.
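The notions in the abstract can be made concrete with a small simulation. The following Python sketch is a hypothetical illustration only (it is not the paper's counterexample construction): it sets up a toy countably infinite MDP over the non-negative integers with Büchi target set {0}, and shows the difference between a Markov strategy, which sees only the current state and step count, and a 1-bit Markov strategy, which may additionally read and flip one memory bit.

```python
import random

# A toy countably infinite MDP, used here only as a hypothetical
# illustration (NOT the paper's counterexample). States are the
# non-negative integers; the Büchi target set is {0}, i.e., the
# objective is to visit state 0 infinitely often.
#
# In state n the controller picks one of two actions:
#   "reset": move to state 0 with probability 1/(n+2), else to n+1
#   "step":  move to state n+1 deterministically
def transition(state, action, rng):
    if action == "reset" and rng.random() < 1.0 / (state + 2):
        return 0
    return state + 1

# A Markov strategy bases decisions only on the current state and the
# number of steps taken so far.
def markov_strategy(state, step):
    return "reset" if step % 2 == 0 else "step"

# A 1-bit Markov strategy may additionally consult and update a single
# bit of memory, here held in a closure.
def make_one_bit_strategy():
    bit = [0]
    def strategy(state, step):
        action = "reset" if bit[0] else "step"
        bit[0] ^= 1  # flip the single memory bit every step
        return action
    return strategy

# Simulate a finite prefix of a run and count visits to the target.
def simulate(strategy, steps, seed=0):
    rng = random.Random(seed)
    state, visits = 0, 0
    for step in range(steps):
        state = transition(state, strategy(state, step), rng)
        if state == 0:
            visits += 1
    return visits
```

For example, `simulate(markov_strategy, 1000)` counts visits to the target along a 1000-step run prefix; under the Büchi objective, a strategy is good if it makes the number of such visits infinite with high probability over infinite runs.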
Publication:
arXiv e-prints
Pub Date:
April 2019
arXiv:
arXiv:1904.11573
Bibcode:
2019arXiv190411573K
Keywords:
Mathematics - Probability;
Computer Science - Formal Languages and Automata Theory;
Mathematics - Optimization and Control
E-Print:
Full version of an ICALP'19 paper; this update only fixes some typesetting issues.