Efficient Contextual Bandits with Continuous Actions

doi:10.48550/arXiv.2006.06040

We create a computationally tractable algorithm for contextual bandits with continuous actions having unknown structure. Our reduction-style algorithm composes with most supervised learning representations. We prove that it works in a general sense and verify the new functionality with large-scale experiments.

Publication: arXiv e-prints

Pub Date: June 2020

DOI: 10.48550/arXiv.2006.06040

arXiv: arXiv:2006.06040

Bibcode: 2020arXiv200606040M

Keywords: Computer Science - Machine Learning; Statistics - Machine Learning

E-Print: To appear at NeurIPS 2020
