Combinatorial Multi-armed Bandits for Real-Time Strategy Games
Abstract
Games with large branching factors pose a significant challenge for game tree search algorithms. In this paper, we address this problem with a sampling strategy for Monte Carlo Tree Search (MCTS) algorithms called {\em naïve sampling}, based on a variant of the Multi-armed Bandit problem called {\em Combinatorial Multi-armed Bandits} (CMAB). We analyze the theoretical properties of several variants of {\em naïve sampling}, and empirically compare it against the other existing strategies in the literature for CMABs. We then evaluate these strategies in the context of real-time strategy (RTS) games, a genre of computer games characterized by their very large branching factors. Our results show that as the branching factor grows, {\em naïve sampling} outperforms the other sampling strategies.
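The abstract's core idea can be sketched in code. The following is a minimal, hedged illustration of naïve sampling for a CMAB, under the "naïve assumption" that the reward of a macro-arm (one value per component) decomposes additively across components, so each component can keep its own local bandit while a global bandit over the macro-arms actually tried handles exploitation. All class and parameter names here (`NaiveSampling`, `epsilon`, etc.) are illustrative choices, not the paper's implementation:

```python
import random
from collections import defaultdict

class NaiveSampling:
    """Sketch of naive sampling for a Combinatorial Multi-armed Bandit.

    A macro-arm is a tuple (a_1, ..., a_n), one value per component.
    Under the naive (additive-reward) assumption, each component keeps a
    local value estimate; a global table over macro-arms seen so far is
    used for exploitation. This is an illustrative sketch, not the
    authors' code.
    """

    def __init__(self, component_arms, epsilon=0.3):
        self.component_arms = component_arms  # list of value lists, one per component
        self.epsilon = epsilon
        # local per-component estimates: value -> [mean reward, count]
        self.local = [defaultdict(lambda: [0.0, 0]) for _ in component_arms]
        # global estimates over macro-arms tried: arm tuple -> (mean, count)
        self.globl = {}

    def select(self):
        if not self.globl or random.random() < self.epsilon:
            # explore: build a macro-arm by picking each component locally
            return tuple(self._local_pick(i) for i in range(len(self.component_arms)))
        # exploit: best macro-arm observed so far
        return max(self.globl, key=lambda a: self.globl[a][0])

    def _local_pick(self, i):
        # epsilon-greedy over this component's local estimates
        stats = self.local[i]
        if not stats or random.random() < self.epsilon:
            return random.choice(self.component_arms[i])
        return max(stats, key=lambda v: stats[v][0])

    def update(self, arm, reward):
        # update the global estimate for the macro-arm
        mean, n = self.globl.get(arm, (0.0, 0))
        self.globl[arm] = ((mean * n + reward) / (n + 1), n + 1)
        # credit the same reward to each component (naive assumption)
        for i, v in enumerate(arm):
            m, k = self.local[i][v]
            self.local[i][v] = [(m * k + reward) / (k + 1), k + 1]
```

The appeal for RTS games is that the number of local estimates grows with the sum of the per-unit action counts, not with their product, which is what makes sampling tractable when the combinatorial branching factor explodes.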
- Publication:
- arXiv e-prints
- Pub Date:
- October 2017
- DOI:
- 10.48550/arXiv.1710.04805
- arXiv:
- arXiv:1710.04805
- Bibcode:
- 2017arXiv171004805O
- Keywords:
- Computer Science - Artificial Intelligence
- E-Print:
- (2017) Journal of Artificial Intelligence Research (JAIR). Volume 58, pp 665-702