Some Simulation Results for Emphatic Temporal-Difference Learning Algorithms
Machine Learning
2016-05-10 v1
Abstract
This is a companion note to our recent study of the weak convergence properties of constrained emphatic temporal-difference learning (ETD) algorithms from a theoretical perspective. It supplements that analysis with simulation results and illustrates the behavior of some of the ETD algorithms on three example problems.
Cite
@article{arxiv.1605.02099,
title = {Some Simulation Results for Emphatic Temporal-Difference Learning Algorithms},
author = {Huizhen Yu},
journal = {arXiv preprint arXiv:1605.02099},
year = {2016}
}
Comments
A companion note to the article arXiv:1511.07471; 30 pages; 34 figures, best viewed on screen