Top Feasible-Arm Selections in Constrained Multi-Armed Bandit
Abstract
This note studies a simple algorithm, called "constrained successive accept or reject (CSAR)," for identifying the set of top feasible arms in a constrained multi-armed bandit, with an emphasis on performance analysis. An upper bound on the probability of incorrect identification by CSAR is established; the bound depends on the problem complexity and the size of the arm set, and it converges to zero exponentially fast as the time horizon goes to infinity.
- Publication: arXiv e-prints
- Pub Date: January 2024
- arXiv: arXiv:2401.08845
- Bibcode: 2024arXiv240108845C
- Keywords: Mathematics - Optimization and Control