A policy iteration algorithm for non-Markovian control problems

Dylan Possamaï; Ludovic Tangpi

Optimization and Control 2024-09-09 v1 Probability

Abstract

In this paper, we propose a new policy iteration algorithm to compute the value function and the optimal controls of continuous-time stochastic control problems. The algorithm relies on successive approximations by linear-quadratic control problems, all of which can be solved explicitly and which, in the Markovian case, only require solving linear PDEs recursively. Though our procedure in general fails to produce a non-decreasing sequence, as the standard algorithm does, it can be made arbitrarily close to being monotone. More importantly, we recover the standard exponential speed of convergence for both the value and the controls, through purely probabilistic arguments which are significantly simpler than in the classical case. Our proof also accommodates non-Markovian dynamics as well as volatility control, allowing us to obtain the first convergence results in the latter case for a multi-dimensional state process.
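
For readers unfamiliar with the "standard algorithm" the abstract compares against, the sketch below shows classical (Howard) policy iteration on a toy finite, discounted Markov decision process. It is only an illustration of the baseline and of its monotonicity property, not the continuous-time linear-quadratic scheme proposed in the paper. The transition matrices `P`, rewards `R`, discount factor `gamma`, and all function names are assumptions made up for this example.

```python
# Classical (Howard) policy iteration on a toy discounted MDP.
# Illustrative only: the data and helper names are hypothetical and this is
# NOT the algorithm proposed in the paper.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def evaluate(policy, P, R, gamma):
    """Policy evaluation: solve the linear system (I - gamma * P_pi) v = r_pi."""
    n = len(policy)
    a = [[(1.0 if i == j else 0.0) - gamma * P[policy[i]][i][j]
          for j in range(n)] for i in range(n)]
    b = [R[policy[i]][i] for i in range(n)]
    return solve_linear(a, b)

def improve(v, P, R, gamma):
    """Policy improvement: pick, in each state, an action maximising the Q-value."""
    n = len(v)
    def q(s, act):
        return R[act][s] + gamma * sum(P[act][s][t] * v[t] for t in range(n))
    return [max(range(len(P)), key=lambda act: q(s, act)) for s in range(n)]

def policy_iteration(P, R, gamma, max_iter=100):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * len(R[0])
    history = []  # value function of each successive policy
    for _ in range(max_iter):
        v = evaluate(policy, P, R, gamma)
        history.append(v)
        new_policy = improve(v, P, R, gamma)
        if new_policy == policy:
            break
        policy = new_policy
    return policy, history

# Toy data: P[action][state][next_state], R[action][state].
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.6, 0.4]]]
R = [[1.0, 0.0],
     [0.0, 2.0]]
gamma = 0.9

policy, history = policy_iteration(P, R, gamma)
for k, v in enumerate(history):
    print(k, [round(x, 3) for x in v])
```

In this maximisation setting, each evaluation step solves a linear system, and the successive value functions stored in `history` are non-decreasing in every state. This is the monotonicity that the abstract says the proposed procedure gives up in general while remaining arbitrarily close to it.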

Keywords

stochastic control; Markov decision processes; optimal control

Cite

@article{arxiv.2409.04037,
  title  = {A policy iteration algorithm for non-Markovian control problems},
  author = {Dylan Possamaï and Ludovic Tangpi},
  journal= {arXiv preprint arXiv:2409.04037},
  year   = {2024}
}

Comments

18 pages
