Diversity-Preserving K-Armed Bandits, Revisited

doi:10.48550/arXiv.2010.01874

We consider the bandit-based framework for diversity-preserving recommendations introduced by Celis et al. (2019), who approached it in the case of a polytope mainly by a reduction to the setting of linear bandits. We design a UCB algorithm using the specific structure of the setting and show that it enjoys a bounded distribution-dependent regret in the natural cases when the optimal mixed actions put some probability mass on all actions (i.e., when diversity is desirable). The regret lower bounds provided show that otherwise, at least when the model is mean-unbounded, a $\ln T$ regret is suffered. We also discuss an example beyond the special case of polytopes.
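The abstract's setting can be illustrated with a minimal sketch. It is not the paper's algorithm, only one plausible reading of "a UCB algorithm over mixed actions", and it rests on assumptions not stated in the abstract: the diversity set is the simple polytope of distributions giving every arm at least `floor` probability, rewards are Bernoulli, and the index is the classical UCB1 bonus. The names `best_mixed_action`, `run`, and `floor` are hypothetical.

```python
import math
import random


def best_mixed_action(ucb, floor):
    """Maximize sum_a p[a] * ucb[a] over {p : p[a] >= floor, sum(p) = 1}.

    For this particular polytope (an assumption of the sketch) the maximizer
    gives every arm its floor and puts the leftover mass on the top-index arm.
    """
    k = len(ucb)
    if floor < 0 or floor * k > 1:
        raise ValueError("floor must satisfy 0 <= floor <= 1/k")
    p = [floor] * k
    best = max(range(k), key=lambda a: ucb[a])
    p[best] += 1.0 - floor * k
    return p


def run(means, floor, horizon, seed=0):
    """Play `horizon` rounds; return the last mixed action and the pull counts."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    p = [1.0 / k] * k
    for t in range(1, horizon + 1):
        # UCB1-style index (assumed); unplayed arms get an infinite index.
        ucb = [
            float("inf") if counts[a] == 0
            else sums[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])
            for a in range(k)
        ]
        # Pick a mixed action in the polytope, then draw the arm from it.
        p = best_mixed_action(ucb, floor)
        arm = rng.choices(range(k), weights=p)[0]
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli reward
        counts[arm] += 1
        sums[arm] += reward
    return p, counts
```

With `floor > 0`, every feasible mixed action in this toy polytope puts mass on all arms, which is the kind of case the abstract describes as diversity being desirable. A call such as `run([0.2, 0.5, 0.8], floor=0.1, horizon=1000)` returns the final mixed action and how often each arm was drawn.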

Publication: arXiv e-prints

Pub Date: October 2020

DOI: 10.48550/arXiv.2010.01874

arXiv: arXiv:2010.01874

Bibcode: 2020arXiv201001874H

Keywords: Statistics - Machine Learning; Computer Science - Machine Learning

E-Print: Transactions on Machine Learning Research Journal, 2024, July
