Manifold Regularization for Kernelized LSTD

Xinyan Yan; Krzysztof Choromanski; Byron Boots; Vikas Sindhwani

Machine Learning 2017-10-17 v1 Artificial Intelligence Machine Learning

Abstract

Policy evaluation, i.e., value function or Q-function approximation, is a key procedure in reinforcement learning (RL). It is a necessary component of policy iteration and can be used for variance reduction in policy gradient methods. Its quality therefore has a significant impact on most RL algorithms. Motivated by manifold regularized learning, we propose a novel kernelized policy evaluation method that takes advantage of the intrinsic geometry of the state space learned from data to achieve better sample efficiency and higher accuracy in Q-function approximation. Applying the proposed method in the Least-Squares Policy Iteration (LSPI) framework, we observe superior policy quality compared to widely used parametric basis functions on two standard benchmarks.
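As a rough illustration of the idea in the abstract, the sketch below combines a kernelized LSTD fit with a graph-Laplacian smoothness penalty, by analogy with Laplacian regularized least squares. It is not taken from the paper: the linear system, the RBF kernel, the dense graph weights, and all names (`rbf_kernel`, `graph_laplacian`, `manifold_klstd`, `q_values`, `lam`, `mu`, `bandwidth`) are assumptions made for this example. Rows of `X` are state-action feature vectors, and rows of `X_next` are the next state paired with the action chosen by the policy being evaluated.

```python
import numpy as np


def rbf_kernel(A, B, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def graph_laplacian(X, bandwidth=1.0):
    """Unnormalized graph Laplacian L = D - W with dense RBF edge weights.

    Assumption: a dense similarity graph; a k-nearest-neighbor graph is a
    common alternative.
    """
    W = rbf_kernel(X, X, bandwidth)
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W


def manifold_klstd(X, R, X_next, gamma=0.9, lam=1e-3, mu=1e-2, bandwidth=1.0):
    """Solve (K - gamma*K_next + lam*I + mu*L@K) alpha = R for alpha.

    Assumed illustrative form: the ridge term lam*I controls the RKHS norm,
    and mu*L@K penalizes Q-values that vary quickly between nearby samples.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, bandwidth)            # k(x_i, x_j)
    K_next = rbf_kernel(X_next, X, bandwidth)  # k(x'_i, x_j)
    L = graph_laplacian(X, bandwidth)
    A = K - gamma * K_next + lam * np.eye(n) + mu * (L @ K)
    return np.linalg.solve(A, R)


def q_values(alpha, X_train, X_query, bandwidth=1.0):
    """Evaluate Q(x) = sum_j alpha_j * k(x, x_j) at the query points."""
    return rbf_kernel(X_query, X_train, bandwidth) @ alpha
```

With `gamma=0` and `mu=0` the system reduces to ordinary kernel ridge regression on the rewards, which is a convenient sanity check. In an LSPI-style loop, `X_next` would be recomputed after each policy improvement step, since it depends on the current policy's action choice.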

Keywords

reinforcement learning, policy gradient, kernel learning

Cite

@article{arxiv.1710.05387,
  title  = {Manifold Regularization for Kernelized LSTD},
  author = {Xinyan Yan and Krzysztof Choromanski and Byron Boots and Vikas Sindhwani},
  journal= {arXiv preprint arXiv:1710.05387},
  year   = {2017}
}

Comments

6 pages, CoRL 2017 non-archival track
