Taylor Expansion Policy Optimization

doi:10.48550/arXiv.2003.06259


In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor expansion policy optimization, a policy optimization formalism that generalizes prior work (e.g., TRPO) as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms.
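As a rough illustration of what an expansion of this kind can look like, the sketch below numerically checks one standard identity on a small random tabular MDP: Q^pi = sum over k >= 0 of [(I - gamma P^mu)^(-1) gamma (P^pi - P^mu)]^k Q^mu, where mu is a behavior policy and pi is a nearby target policy. The zeroth-order term is Q^mu, and the first-order term is the kind of correction the abstract associates with TRPO-style methods. This form follows from the resolvent identity and may differ in notation from the paper; the MDP, the policies, and all function names (`apply_P`, `resolvent`, `taylor_estimates`) are hypothetical choices made for this example only.

```python
import random

def apply_P(P, policy, Q):
    """(P^policy Q)(s, a) = sum_t P[s][a][t] * sum_b policy[t][b] * Q[t][b]."""
    n_s, n_a = len(Q), len(Q[0])
    v = [sum(policy[t][b] * Q[t][b] for b in range(n_a)) for t in range(n_s)]
    return [[sum(P[s][a][t] * v[t] for t in range(n_s)) for a in range(n_a)]
            for s in range(n_s)]

def resolvent(P, policy, b, gamma, iters=500):
    """Approximate (I - gamma P^policy)^(-1) b by iterating x <- b + gamma P^policy x."""
    x = [row[:] for row in b]
    for _ in range(iters):
        Px = apply_P(P, policy, x)
        x = [[b[s][a] + gamma * Px[s][a] for a in range(len(b[0]))]
             for s in range(len(b))]
    return x

def taylor_estimates(P, R, pi, mu, gamma, order):
    """Partial sums of the expansion of Q^pi around mu, for orders 0..order."""
    term = resolvent(P, mu, R, gamma)  # zeroth-order term: Q^mu
    total = [row[:] for row in term]
    estimates = [[row[:] for row in total]]
    for _ in range(order):
        p_pi, p_mu = apply_P(P, pi, term), apply_P(P, mu, term)
        diff = [[gamma * (x - y) for x, y in zip(rp, rm)]
                for rp, rm in zip(p_pi, p_mu)]
        term = resolvent(P, mu, diff, gamma)  # next-order term
        total = [[t + u for t, u in zip(rt, ru)] for rt, ru in zip(total, term)]
        estimates.append([row[:] for row in total])
    return estimates

# Hypothetical 4-state, 2-action MDP with random transitions and rewards.
rng = random.Random(0)
n_s, n_a, gamma = 4, 2, 0.9
P = []
for s in range(n_s):
    P.append([])
    for a in range(n_a):
        w = [rng.random() + 0.1 for _ in range(n_s)]
        P[s].append([x / sum(w) for x in w])
R = [[rng.random() for _ in range(n_a)] for _ in range(n_s)]

# Uniform behavior policy mu and a target policy pi kept close to it,
# so that the series is guaranteed to converge for this gamma.
mu = [[0.5, 0.5] for _ in range(n_s)]
pi = [[0.5 + e, 0.5 - e] for e in (0.02 * rng.random() for _ in range(n_s))]

q_pi = resolvent(P, pi, R, gamma)  # reference value computed directly
estimates = taylor_estimates(P, R, pi, mu, gamma, order=12)
errors = [max(abs(x - y) for re, rq in zip(est, q_pi) for x, y in zip(re, rq))
          for est in estimates]
for k, err in enumerate(errors):
    print(f"order {k}: max |estimate - Q^pi| = {err:.2e}")
```

The sketch keeps pi within a small total-variation distance of mu because the series is only guaranteed to converge when the two policies are close relative to (1 - gamma) / gamma; with a larger gap the partial sums may diverge.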

Publication: arXiv e-prints
Pub Date: March 2020
DOI: 10.48550/arXiv.2003.06259
arXiv: arXiv:2003.06259
Bibcode: 2020arXiv200306259T
Keywords: Computer Science - Machine Learning; Statistics - Machine Learning

NASA/ADS
