Related papers: A policy iteration algorithm for non-Markovian con…

Continuous Policy and Value Iteration for Stochastic Control Problems and Its Convergence

We introduce a continuous policy-value iteration algorithm where the approximations of the value function of a stochastic control problem and the optimal control are simultaneously updated through Langevin-type dynamics. This framework…

Optimization and Control · Mathematics 2025-06-11 Qi Feng, Gu Wang

Value-Gradient Iteration with Quadratic Approximate Value Functions

We propose a method for designing policies for convex stochastic control problems characterized by random linear dynamics and convex stage cost. We consider policies that employ quadratic approximate value functions as a substitute for the…

Optimization and Control · Mathematics 2023-11-10 Alan Yang, Stephen Boyd

Approximate Midpoint Policy Iteration for Linear Quadratic Control

We present a midpoint policy iteration algorithm to solve linear quadratic optimal control problems in both model-based and model-free settings. The algorithm is a variation of Newton's method, and we show that in the model-based setting it…

Optimization and Control · Mathematics 2022-02-16 Benjamin Gravell, Iman Shames, Tyler Summers

An Incremental Sampling-based Algorithm for Stochastic Optimal Control

In this paper, we consider a class of continuous-time, continuous-space stochastic optimal control problems. Building upon recent advances in Markov chain approximation methods and sampling-based algorithms for deterministic path planning,…

Robotics · Computer Science 2012-02-27 Vu Anh Huynh, Sertac Karaman, Emilio Frazzoli

Constrained Policy Optimization for Stochastic Optimal Control under Nonstationary Uncertainties

This article presents a constrained policy optimization approach for the optimal control of systems under nonstationary uncertainties. We introduce an assumption that we call Markov embeddability that allows us to cast the stochastic…

Optimization and Control · Mathematics 2026-05-11 Sungho Shin, François Pacaud, Emil Constantinescu, Mihai Anitescu

Policy Iteration for Multiplicative Noise Output Feedback Control

We propose a policy iteration algorithm for solving the multiplicative noise linear quadratic output feedback design problem. The algorithm solves a set of coupled Riccati equations for estimation and control arising from a partially…

Systems and Control · Electrical Eng. & Systems 2022-04-01 Benjamin Gravell, Matilde Gargiani, John Lygeros, Tyler H. Summers

On gradual-impulse control of continuous-time Markov decision processes with exponential utility

In this paper, we consider the gradual-impulse control problem of continuous-time Markov decision processes, where the system performance is measured by the expectation of the exponential utility of the total cost. We prove, under very…

Optimization and Control · Mathematics 2023-11-16 Xin Guo, Aiko Kurushima, Alexey Piunovskiy, Yi Zhang

Computational methods for stochastic control with metric interval temporal logic specifications

This paper studies an optimal control problem for continuous-time stochastic systems subject to reachability objectives specified in a subclass of metric interval temporal logic specifications, a temporal logic with real-time constraints.…

Systems and Control · Computer Science 2015-04-21 Jie Fu, Ufuk Topcu

Convergence of Policy Iteration for Entropy-Regularized Stochastic Control Problems

For a general entropy-regularized stochastic control problem on an infinite horizon, we prove that a policy iteration algorithm (PIA) converges to an optimal relaxed control. Contrary to the standard stochastic control literature, classical…

Optimization and Control · Mathematics 2026-05-14 Yu-Jui Huang, Zhenhua Wang, Zhou Zhou

Data-driven policy iteration algorithm for continuous-time stochastic linear-quadratic optimal control problems

This paper studies a continuous-time stochastic linear-quadratic (SLQ) optimal control problem on infinite-horizon. A data-driven policy iteration algorithm is proposed to solve the SLQ problem. Without knowing three system coefficient…

Optimization and Control · Mathematics 2022-09-30 Heng Zhang, Na Li

Continuous-time iterative linear-quadratic regulator

We present a continuous-time equivalent to the well-known iterative linear-quadratic algorithm including an implementation of a backtracking line-search policy and a novel regularization approach based on the necessary conditions in the…

Systems and Control · Electrical Eng. & Systems 2025-05-22 Juraj Lieskovský, Jaroslav Bušek, Tomáš Vyhlídal

The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes

The main goal of this paper is to apply the so-called policy iteration algorithm (PIA) for the long run average continuous control problem of piecewise deterministic Markov processes (PDMPs) taking values in a general Borel space and with…

Probability · Mathematics 2009-02-17 O. L. V. Costa, F. Dufour

Convergence Analysis for Entropy-Regularized Control Problems: A Probabilistic Approach

In this paper we investigate the convergence of the Policy Iteration Algorithm (PIA) for a class of general continuous-time entropy-regularized stochastic control problems. In particular, instead of employing sophisticated PDE estimates for…

Optimization and Control · Mathematics 2025-04-24 Jin Ma, Gaozhan Wang, Jianfeng Zhang

Convergence Analysis of Policy Iteration

Adaptive optimal control of nonlinear dynamic systems with deterministic and known dynamics under a known undiscounted infinite-horizon cost function is investigated. Policy iteration scheme initiated using a stabilizing initial control is…

Systems and Control · Computer Science 2015-05-21 Ali Heydari

Policy Iteration for Relational MDPs

Relational Markov Decision Processes are a useful abstraction for complex reinforcement learning problems and stochastic planning problems. Recent work developed representation schemes and algorithms for planning in such problems using the…

Artificial Intelligence · Computer Science 2012-06-26 Chenggang Wang, Roni Khardon

A Machine Learning Algorithm for Finite-Horizon Stochastic Control Problems in Economics

We propose a machine learning algorithm for solving finite-horizon stochastic control problems based on a deep neural network representation of the optimal policy functions. The algorithm has three features: (1) It can solve…

General Economics · Economics 2024-12-09 Xianhua Peng, Steven Kou, Lekang Zhang

Stochastic Approximation with Markov Noise: Analysis and applications in reinforcement learning

We present for the first time an asymptotic convergence analysis of two time-scale stochastic approximation driven by "controlled" Markov noise. In particular, the faster and slower recursions have non-additive controlled Markov noise…

Machine Learning · Computer Science 2020-12-03 Prasenjit Karmakar

A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies

We consider stochastic control models with Borel spaces and universally measurable policies. For such models the standard policy iteration is known to have difficult measurability issues and cannot be carried out in general. We present a…

Optimization and Control · Mathematics 2016-02-26 Huizhen Yu, Dimitri P. Bertsekas

An Efficient Policy Iteration Algorithm for Dynamic Programming Equations

We present an accelerated algorithm for the solution of static Hamilton-Jacobi-Bellman equations related to optimal control problems. Our scheme is based on a classic policy iteration procedure, which is known to have superlinear…

Optimization and Control · Mathematics 2016-02-22 Alessandro Alla, Maurizio Falcone, Dante Kalise

Computing Optimal Joint Chance Constrained Control Policies

We consider the problem of optimally controlling stochastic, Markovian systems subject to joint chance constraints over a finite-time horizon. For such problems, standard Dynamic Programming is inapplicable due to the time correlation of…

Optimization and Control · Mathematics 2024-11-22 Niklas Schmid, Marta Fochesato, Sarah H. Q. Li, Tobias Sutter, John Lygeros