Gittins' theorem under uncertainty
Abstract
We study dynamic allocation problems for discrete-time multi-armed bandits under uncertainty, based on the theory of nonlinear expectations. We show that, under strong independence of the bandits and with some relaxation in the definition of optimality, a Gittins allocation index gives optimal choices. This involves studying the interaction of our uncertainty with controls which determine the filtration. We also run a simple numerical example which illustrates the interaction between the agent's willingness to explore and uncertainty aversion when making decisions.
- Publication: arXiv e-prints
- Pub Date: July 2019
- arXiv: arXiv:1907.05689
- Bibcode: 2019arXiv190705689C
- Keywords: Mathematics - Optimization and Control; Mathematics - Probability; Mathematics - Statistics Theory; Quantitative Finance - Computational Finance; 93E35; 60G40; 91B32; 91B70