GQ($\lambda$) Quick Reference and Implementation Guide

Adam White; Richard S. Sutton

Machine Learning 2017-05-12 v1

Authors: Adam White, Richard S. Sutton

Abstract

This document is a quick reference and implementation guide for linear GQ($\lambda$), a gradient-based off-policy temporal-difference learning algorithm. The intuition and theory behind the algorithm are explained elsewhere (e.g., Maei & Sutton 2010, Maei 2011). If you have questions or concerns about the content of this document or the attached Java code, please email Adam White (adam.white@ualberta.ca). The code is provided as part of the source files in the arXiv submission.

Cite

@article{arxiv.1705.03967,
  title  = {GQ($\lambda$) Quick Reference and Implementation Guide},
  author = {Adam White and Richard S. Sutton},
  journal= {arXiv preprint arXiv:1705.03967},
  year   = {2017}
}
