GQ($\lambda$) Quick Reference and Implementation Guide

doi:10.48550/arXiv.1705.03967

This document is intended as a quick reference and implementation guide for linear GQ($\lambda$), a gradient-based off-policy temporal-difference learning algorithm. The intuition and theory behind the algorithm are explained elsewhere (e.g., Maei & Sutton 2010, Maei 2011). If you have questions or concerns about the content of this document or the attached Java code, please email Adam White (adam.white@ualberta.ca). The code is provided as part of the source files in the arXiv submission.

Publication: arXiv e-prints

Pub Date: May 2017

DOI: 10.48550/arXiv.1705.03967

arXiv: arXiv:1705.03967

Bibcode: 2017arXiv170503967W

Keywords: Computer Science - Machine Learning
