Differential Good Arm Identification
Abstract
This paper targets a variant of the stochastic multi-armed bandit problem called good arm identification (GAI). GAI is a pure-exploration bandit problem with the goal to output as many good arms using as few samples as possible, where a good arm is defined as an arm whose expected reward is greater than a given threshold. In this work, we propose DGAI - a differentiable good arm identification algorithm to improve the sample complexity of the state-of-the-art HDoC algorithm in a data-driven fashion. We also showed that the DGAI can further boost the performance of a general multi-arm bandit (MAB) problem given a threshold as a prior knowledge to the arm set. Extensive experiments confirm that our algorithm outperform the baseline algorithms significantly in both synthetic and real world datasets for both GAI and MAB tasks.
- Publication:
-
arXiv e-prints
- Pub Date:
- March 2023
- DOI:
- 10.48550/arXiv.2303.07154
- arXiv:
- arXiv:2303.07154
- Bibcode:
- 2023arXiv230307154T
- Keywords:
-
- Computer Science - Machine Learning;
- Statistics - Machine Learning