Nearly Minimax Algorithms for Linear Bandits with Shared Representation

doi:10.48550/arXiv.2203.15664

We give novel algorithms for multi-task and lifelong linear bandits with shared representation. Specifically, we consider the setting where we play $M$ linear bandits with dimension $d$, each for $T$ rounds, and these $M$ bandit tasks share a common $k$-dimensional ($k \ll d$) linear representation. For both the multi-task setting, where we play the tasks concurrently, and the lifelong setting, where we play the tasks sequentially, we give algorithms that achieve $\widetilde{O}\left(d\sqrt{kMT} + kM\sqrt{T}\right)$ regret bounds, which match the known minimax regret lower bound up to logarithmic factors and close the gap in existing results [Yang et al., 2021]. Our main techniques include a more efficient estimator for the low-rank linear feature extractor and an accompanying novel analysis of this estimator.
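To give a feel for the size of the stated bound, the following minimal sketch evaluates its leading terms for one choice of parameters and compares them with a naive baseline. The baseline is an assumption made here for illustration, not a claim from the abstract: it supposes each of the $M$ tasks is solved independently at the standard $d\sqrt{T}$ linear-bandit rate. Constants and logarithmic factors are ignored throughout, the function names are hypothetical, and the parameter values are arbitrary.

```python
import math


def shared_rate(d, k, M, T):
    """Leading terms of the bound in the abstract: d*sqrt(k*M*T) + k*M*sqrt(T).

    Constants and logarithmic factors are dropped.
    """
    return d * math.sqrt(k * M * T) + k * M * math.sqrt(T)


def independent_rate(d, M, T):
    """Illustrative baseline (an assumption, not from the abstract):
    M tasks solved separately, each at the d*sqrt(T) rate."""
    return M * d * math.sqrt(T)


# Arbitrary example values with k much smaller than d.
d, k, M, T = 100, 4, 50, 10_000

shared = shared_rate(d, k, M, T)
independent = independent_rate(d, M, T)

print(f"shared-representation rate: {shared:.1f}")
print(f"independent-tasks rate:     {independent:.1f}")
print(f"ratio (independent/shared): {independent / shared:.2f}")
```

With $k \ll d$ and many tasks, the first term $d\sqrt{kMT}$ grows like $\sqrt{M}$ rather than $M$, which is where the saving over the independent baseline comes from in this comparison.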

Publication: arXiv e-prints

Pub Date: March 2022

DOI: 10.48550/arXiv.2203.15664

arXiv: arXiv:2203.15664

Bibcode: 2022arXiv220315664Y

Keywords: Computer Science - Machine Learning; Statistics - Machine Learning

E-Print: 19 pages, 3 figures

NASA/ADS