High-Dimensional Sparse Linear Bandits

Stochastic linear bandits with high-dimensional sparse features are a practical model for a variety of domains, including personalized medicine and online advertising. We derive a novel $\Omega(n^{2/3})$ dimension-free minimax regret lower bound for sparse linear bandits in the data-poor regime where the horizon is smaller than the ambient dimension and where the feature vectors admit a well-conditioned exploration distribution. This is complemented by a nearly matching upper bound for an explore-then-commit algorithm, showing that $\Theta(n^{2/3})$ is the optimal rate in the data-poor regime. The results complement existing bounds for the data-rich regime and provide another example where carefully balancing the trade-off between information and regret is necessary. Finally, we prove a dimension-free $O(\sqrt{n})$ regret upper bound under an additional assumption on the magnitude of the signal for relevant features.
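The explore-then-commit idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a finite action set, uses uniform sampling over actions as a stand-in for the well-conditioned exploration distribution, fits a Lasso estimate with a plain ISTA loop, and sets the exploration length to roughly $n^{2/3}$ with all problem-dependent constants dropped. The names `soft_threshold`, `lasso_ista`, `explore_then_commit`, and the regularization level `lam` are hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=500):
    """Approximately minimize (1/(2m)) ||y - X theta||^2 + lam ||theta||_1 with ISTA."""
    m, d = X.shape
    theta = np.zeros(d)
    # Lipschitz constant of the smooth part: largest eigenvalue of X^T X / m.
    lipschitz = np.linalg.norm(X, 2) ** 2 / m
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - y) / m
        theta = soft_threshold(theta - grad / lipschitz, lam / lipschitz)
    return theta


def explore_then_commit(actions, theta_star, n, rng, noise_sd=1.0):
    """Explore for about n^(2/3) rounds, then commit to the greedy action.

    actions    : (K, d) array of feature vectors (assumed finite action set)
    theta_star : (d,) true sparse parameter, used only to simulate rewards
    Returns the exploration length, the Lasso estimate, and the pseudo-regret.
    """
    K, d = actions.shape
    n1 = min(n, int(np.ceil(n ** (2.0 / 3.0))))  # constants omitted (assumption)

    # Exploration phase: uniform sampling stands in for the exploration distribution.
    idx = rng.integers(K, size=n1)
    X = actions[idx]
    y = X @ theta_star + noise_sd * rng.standard_normal(n1)

    # Sparse estimate; lam follows the usual sqrt(log d / n1) scaling (assumption).
    lam = 2.0 * noise_sd * np.sqrt(2.0 * np.log(d) / n1)
    theta_hat = lasso_ista(X, y, lam)

    # Commit phase: play the action that looks best under theta_hat.
    committed = int(np.argmax(actions @ theta_hat))

    means = actions @ theta_star
    gap_explore = np.sum(means.max() - means[idx])
    gap_commit = (n - n1) * (means.max() - means[committed])
    return n1, theta_hat, gap_explore + gap_commit


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, K, n = 200, 50, 100  # data-poor regime: horizon n smaller than dimension d
    actions = rng.choice([-1.0, 1.0], size=(K, d))
    theta_star = np.zeros(d)
    theta_star[:3] = 1.0
    n1, theta_hat, regret = explore_then_commit(actions, theta_star, n, rng)
    print(n1, np.count_nonzero(theta_hat), regret)
```

The sketch reflects the trade-off the abstract describes: a longer exploration phase improves the estimate but costs regret in every exploratory round, and an exploration length of order $n^{2/3}$ is the balance point that yields the stated rate.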

Publication:

arXiv e-prints

Pub Date:

November 2020

DOI:

10.48550/arXiv.2011.04020

arXiv:

arXiv:2011.04020

Bibcode:

2020arXiv201104020H

Keywords:

Statistics - Machine Learning;
Computer Science - Machine Learning;
Mathematics - Statistics Theory

E-Print:

Accepted by NeurIPS 2020
