Variance Adjusted Actor Critic Algorithms

Machine Learning 2013-10-15 v1 Machine Learning Systems and Control

Abstract

We present an actor-critic framework for MDPs where the objective is the variance-adjusted expected return. Our critic uses linear function approximation, and we extend the concept of compatible features to the variance-adjusted setting. We present an episodic actor-critic algorithm and show that it converges almost surely to a locally optimal point of the objective function.
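
To make the objective concrete, the sketch below assumes the variance-adjusted expected return takes the common form of the mean return minus a penalty weight times the variance of the return. That form, the function name `variance_adjusted_return`, the penalty weight `mu`, and the sample returns are illustrative assumptions, not taken from the abstract.

```python
from statistics import fmean, pvariance


def variance_adjusted_return(returns, mu):
    """Sample estimate of E[B] - mu * Var[B] over episode returns B.

    Assumed form of the objective; `mu` is a hypothetical penalty weight.
    """
    mean_return = fmean(returns)
    # Population variance of the sampled returns around their mean.
    return mean_return - mu * pvariance(returns, mean_return)


# Hypothetical total returns from four episodes under a fixed policy.
episode_returns = [1.0, 3.0, 1.0, 3.0]
score = variance_adjusted_return(episode_returns, mu=0.5)
```

Under this assumed form, a larger `mu` favors policies whose returns vary less, and `mu = 0` recovers the ordinary expected return.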

Cite

@article{arxiv.1310.3697,
  title  = {Variance Adjusted Actor Critic Algorithms},
  author = {Aviv Tamar and Shie Mannor},
  journal= {arXiv preprint arXiv:1310.3697},
  year   = {2013}
}