
Manifold Regularization for Kernelized LSTD

Machine Learning 2017-10-17 v1 Artificial Intelligence Machine Learning

Abstract

Policy evaluation, i.e., value function or Q-function approximation, is a key procedure in reinforcement learning (RL). It is a necessary component of policy iteration and can be used for variance reduction in policy gradient methods. Its quality therefore has a significant impact on most RL algorithms. Motivated by manifold regularized learning, we propose a novel kernelized policy evaluation method that exploits the intrinsic geometry of the state space, learned from data, to achieve better sample efficiency and higher accuracy in Q-function approximation. Applying the proposed method within the Least-Squares Policy Iteration (LSPI) framework, we observe better policy quality than with widely used parametric basis functions on two standard benchmarks.

Cite

@article{arxiv.1710.05387,
  title   = {Manifold Regularization for Kernelized LSTD},
  author  = {Xinyan Yan and Krzysztof Choromanski and Byron Boots and Vikas Sindhwani},
  journal = {arXiv preprint arXiv:1710.05387},
  year    = {2017}
}

Comments

6 pages, CoRL 2017 non-archival track
