Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces

Policy evaluation with linear function approximation is an important problem in reinforcement learning. In high-dimensional feature spaces, the problem becomes extremely hard in terms of both computational efficiency and approximation quality. We propose a new algorithm, LSTD($\lambda$)-RP, which combines random projections with eligibility traces to address these two challenges. We carry out a theoretical analysis of LSTD($\lambda$)-RP and provide meaningful upper bounds on the estimation error, approximation error, and total generalization error. These results show that LSTD($\lambda$)-RP benefits from both random projections and eligibility traces, and that it can achieve better performance than the prior LSTD-RP and LSTD($\lambda$) algorithms.
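To make the idea concrete, the following is a minimal sketch of how LSTD($\lambda$) might be run on randomly projected features. It is not taken from the paper. The Gaussian projection with variance 1/d, the function names `lstd_lambda_rp` and `predict`, the single-trajectory input layout, and the singular-value cutoff in the solve are all assumptions made for illustration.

```python
import numpy as np


def lstd_lambda_rp(features, rewards, gamma, lam, d, seed=0):
    """Sketch of LSTD(lambda) on randomly projected features (illustrative).

    features: array of shape (n + 1, D), features of states x_0 .. x_n
    rewards:  array of shape (n,), reward observed on each transition
    Returns (theta, projection): theta has shape (d,), projection (d, D).
    """
    features = np.asarray(features, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    n, D = rewards.shape[0], features.shape[1]

    # Assumption: i.i.d. Gaussian projection entries with variance 1/d.
    rng = np.random.default_rng(seed)
    projection = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, D))
    low = features @ projection.T  # projected features, shape (n + 1, d)

    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)  # eligibility trace in the projected space
    for t in range(n):
        z = gamma * lam * z + low[t]
        A += np.outer(z, low[t] - gamma * low[t + 1])
        b += z * rewards[t]

    # Least-squares solve with a singular-value cutoff, so a rank-deficient
    # A still yields a (minimum-norm) solution. The cutoff is an assumption.
    theta, *_ = np.linalg.lstsq(A, b, rcond=1e-10)
    return theta, projection


def predict(feature, theta, projection):
    """Value estimate for one high-dimensional feature vector."""
    return float((projection @ np.asarray(feature, dtype=float)) @ theta)
```

In this sketch the linear system is only d by d rather than D by D, which is where the computational saving from the projection would come from. Setting `lam=0` drops the trace and reduces the update to plain LSTD on the projected features.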

Publication: arXiv e-prints
Pub Date: May 2018
DOI: 10.48550/arXiv.1805.10005
arXiv: arXiv:1805.10005
Bibcode: 2018arXiv180510005L
Keywords: Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Statistics - Machine Learning
E-Print: IJCAI 2018
