Strategy Improvement for Concurrent Reachability and Safety Games
Abstract
We consider concurrent games played on graphs. At every round of a game, each player simultaneously and independently selects a move; the moves jointly determine the transition to a successor state. Two basic objectives are the safety objective to stay forever in a given set of states, and its dual, the reachability objective to reach a given set of states. First, we present a simple proof of the fact that in concurrent reachability games, for all $\epsilon>0$, memoryless $\epsilon$-optimal strategies exist. A memoryless strategy is independent of the history of plays, and an $\epsilon$-optimal strategy achieves the objective with probability within $\epsilon$ of the value of the game. In contrast to previous proofs of this fact, our proof is more elementary and more combinatorial. Second, we present a strategy-improvement (a.k.a. policy-iteration) algorithm for concurrent games with reachability objectives. We then present a strategy-improvement algorithm for concurrent games with safety objectives. Our algorithms yield sequences of player-1 strategies which ensure probabilities of winning that converge monotonically to the value of the game. Our result is significant because the strategy-improvement algorithm for safety games provides, for the first time, a way to approximate the value of a concurrent safety game from below. Previous methods could approximate the values of these games only from one direction, and as no rates of convergence are known, they did not provide a practical way to solve these games.
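To make the notion of a memoryless $\epsilon$-optimal strategy concrete, the following is a minimal sketch on a hypothetical one-state concurrent reachability game. The game, its move names (hide, run, wait, throw), and the helper functions are assumptions chosen for illustration; they are modeled on a small "hide-or-run" style example and are not taken from the paper. Player 1 runs with probability eps in every round; the sketch computes the probability of reaching the target that this memoryless strategy guarantees against player 2.

```python
from fractions import Fraction

# Hypothetical game (an assumption for illustration, not from the paper):
# outcome of each (player-1 move, player-2 move) pair in the single
# non-absorbing state. "win" is the target, "lose" is a losing sink.
OUTCOME = {
    ("hide", "wait"): "stay",
    ("hide", "throw"): "win",
    ("run", "wait"): "win",
    ("run", "throw"): "lose",
}

def reach_probability(eps, q):
    """Probability of eventually reaching "win" when player 1 runs with
    probability eps and player 2 throws with probability q, both memoryless."""
    p1 = {"run": eps, "hide": 1 - eps}
    p2 = {"throw": q, "wait": 1 - q}
    step = {"win": Fraction(0), "lose": Fraction(0), "stay": Fraction(0)}
    for (a, b), out in OUTCOME.items():
        step[out] += p1[a] * p2[b]
    if step["stay"] == 1:
        return Fraction(0)  # the play stays in the start state forever
    # Geometric series: condition on the round in which the play leaves.
    return step["win"] / (1 - step["stay"])

def guaranteed(eps):
    """Worst case over player 2's replies. Once player 1's memoryless
    strategy is fixed, player 2 faces a Markov decision process, so it
    suffices to check the two pure replies."""
    eps = Fraction(eps)
    return min(reach_probability(eps, Fraction(q)) for q in (0, 1))

if __name__ == "__main__":
    for eps in (Fraction(0), Fraction(1, 10), Fraction(1, 100)):
        print(eps, guaranteed(eps))
```

In this toy game, running with probability eps > 0 guarantees reaching the target with probability 1 - eps, while never running (eps = 0) guarantees nothing, since player 2 can wait forever. The guaranteed probability can therefore be pushed arbitrarily close to 1 by memoryless strategies without any of them attaining it, which is why the abstract speaks of $\epsilon$-optimal rather than optimal strategies.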
 Publication:

arXiv e-prints
 Pub Date:
 January 2012
 arXiv:
 arXiv:1201.2834
 Bibcode:
 2012arXiv1201.2834C
 Keywords:

 Computer Science - Computer Science and Game Theory
 E-Print:
 arXiv admin note: substantial text overlap with arXiv:0804.4530 and arXiv:0809.4017