Performance guarantees for model-based Approximate Dynamic Programming in continuous spaces

Paul N. Beuchat; Angelos Georghiou; John Lygeros

Systems and Control 2018-08-31 v3

Abstract

We study both the value function and Q-function formulations of the Linear Programming approach to Approximate Dynamic Programming. The approach is model-based and optimizes over a restricted function space to approximate the value function or Q-function. Working in the discrete-time, continuous-space setting, we provide guarantees for the fitting error and for the online performance of the policy. In particular, the online performance guarantee is obtained by analyzing an iterated version of the greedy policy, and the fitting error guarantee by analyzing an iterated version of the Bellman inequality. These guarantees complement the existing bounds in the literature. The Q-function formulation offers benefits, for example in decentralized controller design; however, it can lead to computationally demanding optimization problems. To alleviate this drawback, we provide a condition that simplifies the formulation, resulting in reduced computation times.
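As a rough illustration of the iterated Bellman inequality mentioned in the abstract, the sketch below uses a three-state, two-input toy problem with deterministic transitions. The finite state space, the numbers in `COST` and `NEXT`, the discount factor, and all function names are assumptions made for this example; they do not come from the paper, which works in continuous spaces. The sketch shows a candidate function that violates the one-step inequality V <= TV yet satisfies the two-step inequality V <= T^2 V, and that still lower-bounds the optimal cost-to-go.

```python
# Toy finite problem (hypothetical numbers), standing in for the
# continuous-space setting of the paper.
GAMMA = 0.9                                  # discount factor (assumed)
COST = [[1.0, 2.0], [0.5, 3.0], [2.0, 0.0]]  # COST[x][u]: stage cost
NEXT = [[1, 2], [0, 2], [2, 0]]              # NEXT[x][u]: successor state


def bellman(V):
    """Bellman operator: (TV)(x) = min_u COST[x][u] + GAMMA * V[NEXT[x][u]]."""
    return [
        min(COST[x][u] + GAMMA * V[NEXT[x][u]] for u in range(2))
        for x in range(3)
    ]


def iterate(V, m):
    """Apply the Bellman operator m times."""
    for _ in range(m):
        V = bellman(V)
    return V


def satisfies_iterated_inequality(V, m, tol=1e-12):
    """Check V <= T^m V pointwise (m = 1 is the usual Bellman inequality)."""
    return all(v <= t + tol for v, t in zip(V, iterate(V, m)))


# Optimal cost-to-go, approximated by value iteration from zero.
V_star = iterate([0.0, 0.0, 0.0], 2000)

# Candidate function chosen by hand for this toy problem.
V_candidate = [7.6, 6.0, 5.7]

if __name__ == "__main__":
    print("one-step inequality holds:",
          satisfies_iterated_inequality(V_candidate, 1))
    print("two-step inequality holds:",
          satisfies_iterated_inequality(V_candidate, 2))
    print("candidate lower-bounds V*:",
          all(v <= s for v, s in zip(V_candidate, V_star)))
```

Because T is monotone and a contraction with fixed point V*, any function satisfying V <= T^m V is a pointwise lower bound on V*, so iterating enlarges the set of admissible candidates without losing the bound. In the Linear Programming approach, such an inequality appears as a constraint over a restricted function space rather than being checked for a single hand-picked candidate.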

Keywords

reinforcement learning, program analysis, control theory

Cite

@article{arxiv.1602.07273,
  title  = {Performance guarantees for model-based Approximate Dynamic Programming in continuous spaces},
  author = {Paul N. Beuchat and Angelos Georghiou and John Lygeros},
  journal= {arXiv preprint arXiv:1602.07273},
  year   = {2018}
}

Comments

18 pages, 5 figures, journal paper
