Related papers: An Efficient Policy Iteration Algorithm for Dynami…

Policy iteration for Hamilton-Jacobi-Bellman equations with control constraints

Policy iteration is a widely used technique to solve the Hamilton Jacobi Bellman (HJB) equation, which arises from nonlinear optimal feedback control theory. Its convergence analysis has attracted much attention in the unconstrained case.…

Optimization and Control · Mathematics 2020-05-19 Sudeep Kundu, Karl Kunisch

Convergence Analysis of Policy Iteration

Adaptive optimal control of nonlinear dynamic systems with deterministic and known dynamics under a known undiscounted infinite-horizon cost function is investigated. Policy iteration scheme initiated using a stabilizing initial control is…

Systems and Control · Computer Science 2015-05-21 Ali Heydari

A Supplementary Condition for the Convergence of the Control Policy during Adaptive Dynamic Programming

Reinforcement learning based adaptive/approximate dynamic programming (ADP) is a powerful technique to determine an approximate optimal controller for a dynamical system. These methods bypass the need to analytically solve the nonlinear…

Optimization and Control · Mathematics 2018-05-24 Xuefeng Bao, Zhi-Hong Mao, Nitin Sharma

Hamilton-Jacobi Based Policy-Iteration via Deep Operator Learning

The framework of deep operator network (DeepONet) has been widely exploited thanks to its capability of solving high dimensional partial differential equations. In this paper, we incorporate DeepONet with a recently developed policy…

Optimization and Control · Mathematics 2024-06-18 Jae Yong Lee, Yeoneung Kim

Online Adaptive Optimal Control Algorithm Based on Synchronous Integral Reinforcement Learning With Explorations

In this paper, we present a novel algorithm named synchronous integral Q-learning, which is based on synchronous policy iteration, to solve the continuous-time infinite horizon optimal control problems of input-affine system dynamics. The…

Systems and Control · Electrical Eng. & Systems 2021-05-20 Lei Guo, Han Zhao

A Theoretical Difficulty in Approximate Dynamic Programming with Input Constraints

Equipping approximate dynamic programming (ADP) with input constraints has a tremendous significance. This enables ADP to be applied to the systems with actuator limitations, which is quite common for dynamical systems. In a conventional…

Optimization and Control · Mathematics 2018-05-24 Xuefeng Bao, Zhi-Hong Mao, Nitin Sharma

On the Convergence of the Policy Iteration for Infinite-Horizon Nonlinear Optimal Control Problems

Policy iteration (PI) is a widely used algorithm for synthesizing optimal feedback control policies across many engineering and scientific applications. When PI is deployed on infinite-horizon, nonlinear, autonomous optimal-control…

Optimization and Control · Mathematics 2025-07-15 Tobias Ehring, Behzad Azmi, Bernard Haasdonk

Global Adaptive Dynamic Programming for Continuous-Time Nonlinear Systems

This paper presents a novel method of global adaptive dynamic programming (ADP) for the adaptive optimal control of nonlinear polynomial systems. The strategy consists of relaxing the problem of solving the Hamilton-Jacobi-Bellman (HJB)…

Dynamical Systems · Mathematics 2017-01-11 Yu Jiang, Zhong-Ping Jiang

Convergence of Policy Iteration for Entropy-Regularized Stochastic Control Problems

For a general entropy-regularized stochastic control problem on an infinite horizon, we prove that a policy iteration algorithm (PIA) converges to an optimal relaxed control. Contrary to the standard stochastic control literature, classical…

Optimization and Control · Mathematics 2026-05-14 Yu-Jui Huang, Zhenhua Wang, Zhou Zhou

Value and Policy Iteration in Optimal Control and Adaptive Dynamic Programming

In this paper, we consider discrete-time infinite horizon problems of optimal control to a terminal set of states. These are the problems that are often taken as the starting point for adaptive dynamic programming. Under very general…

Systems and Control · Computer Science 2015-10-05 Dimitri P. Bertsekas

Continuous-Time Fitted Value Iteration for Robust Policies

Solving the Hamilton-Jacobi-Bellman equation is important in many domains including control, robotics and economics. Especially for continuous control, solving this differential equation and its extension the Hamilton-Jacobi-Isaacs…

Robotics · Computer Science 2021-10-06 Michael Lutter, Boris Belousov, Shie Mannor, Dieter Fox, Animesh Garg, Jan Peters

Data-based approximate policy iteration for nonlinear continuous-time optimal control design

This paper addresses the model-free nonlinear optimal problem with generalized cost functional, and a data-based reinforcement learning technique is developed. It is known that the nonlinear optimal control problem relies on the solution of…

Systems and Control · Computer Science 2013-11-20 Biao Luo, Huai-Ning Wu, Tingwen Huang, Derong Liu

An Adaptive Multi-Level Max-Plus Method for Deterministic Optimal Control Problems

We introduce a new numerical method to approximate the solution of a finite horizon deterministic optimal control problem. We exploit two Hamilton-Jacobi-Bellman PDE, arising by considering the dynamics in forward and backward time. This…

Optimization and Control · Mathematics 2023-04-21 Marianne Akian, Stéphane Gaubert, Shanqing Liu

Continuous Policy and Value Iteration for Stochastic Control Problems and Its Convergence

We introduce a continuous policy-value iteration algorithm where the approximations of the value function of a stochastic control problem and the optimal control are simultaneously updated through Langevin-type dynamics. This framework…

Optimization and Control · Mathematics 2025-06-11 Qi Feng, Gu Wang

A neural network based policy iteration algorithm with global $H^2$-superlinear convergence for stochastic games on domains

In this work, we propose a class of numerical schemes for solving semilinear Hamilton-Jacobi-Bellman-Isaacs (HJBI) boundary value problems which arise naturally from exit time problems of diffusion processes with controlled drift. We…

Numerical Analysis · Mathematics 2020-02-14 Kazufumi Ito, Christoph Reisinger, Yufei Zhang

Policy Iteration for Stationary Discounted Hamilton-Jacobi-Bellman Equations: A Viscosity Approach

We study policy iteration (PI) for deterministic infinite-horizon discounted optimal control problems, whose value function is characterized by a stationary Hamilton-Jacobi-Bellman (HJB) equation. At the PDE level, PI is fundamentally…

Optimization and Control · Mathematics 2026-04-14 Namkyeong Cho, Yeoneung Kim

A policy iteration algorithm for non-Markovian control problems

In this paper, we propose a new policy iteration algorithm to compute the value function and the optimal controls of continuous time stochastic control problems. The algorithm relies on successive approximations using linear-quadratic…

Optimization and Control · Mathematics 2024-09-09 Dylan Possamaï, Ludovic Tangpi

Multiagent Value Iteration Algorithms in Dynamic Programming and Reinforcement Learning

We consider infinite horizon dynamic programming problems, where the control at each stage consists of several distinct decisions, each one made by one of several agents. In an earlier work we introduced a policy iteration algorithm, where…

Optimization and Control · Mathematics 2020-05-05 Dimitri Bertsekas

Information-Theoretic Stochastic Optimal Control via Incremental Sampling-based Algorithms

This paper considers optimal control of dynamical systems which are represented by nonlinear stochastic differential equations. It is well-known that the optimal control policy for this problem can be obtained as a function of a value…

Robotics · Computer Science 2014-05-30 Oktay Arslan, Evangelos Theodorou, Panagiotis Tsiotras

An efficient method for multiobjective optimal control and optimal control subject to integral constraints

We introduce a new and efficient numerical method for multicriterion optimal control and single criterion optimal control under integral constraints. The approach is based on extending the state space to include information on a "budget"…

Optimization and Control · Mathematics 2016-01-06 Ajeet Kumar, Alexander Vladimirsky