Related papers: An Efficient Policy Iteration Algorithm for Dynami…
Policy iteration is a widely used technique to solve the Hamilton Jacobi Bellman (HJB) equation, which arises from nonlinear optimal feedback control theory. Its convergence analysis has attracted much attention in the unconstrained case.…
Adaptive optimal control of nonlinear dynamic systems with deterministic and known dynamics under a known undiscounted infinite-horizon cost function is investigated. Policy iteration scheme initiated using a stabilizing initial control is…
Reinforcement learning based adaptive/approximate dynamic programming (ADP) is a powerful technique to determine an approximate optimal controller for a dynamical system. These methods bypass the need to analytically solve the nonlinear…
The framework of deep operator network (DeepONet) has been widely exploited thanks to its capability of solving high dimensional partial differential equations. In this paper, we incorporate DeepONet with a recently developed policy…
In this paper, we present a novel algorithm named synchronous integral Q-learning, which is based on synchronous policy iteration, to solve the continuous-time infinite horizon optimal control problems of input-affine system dynamics. The…
Equipping approximate dynamic programming (ADP) with inputconstraints has a tremendous significance. This enables ADP to be applied tothe systems with actuator limitations, which is quite common for dynamicalsystems. In a conventional…
Policy iteration (PI) is a widely used algorithm for synthesizing optimal feedback control policies across many engineering and scientific applications. When PI is deployed on infinite-horizon, nonlinear, autonomous optimal-control…
This paper presents a novel method of global adaptive dynamic programming (ADP) for the adaptive optimal control of nonlinear polynomial systems. The strategy consists of relaxing the problem of solving the Hamilton-Jacobi-Bellman (HJB)…
For a general entropy-regularized stochastic control problem on an infinite horizon, we prove that a policy iteration algorithm (PIA) converges to an optimal relaxed control. Contrary to the standard stochastic control literature, classical…
In this paper, we consider discrete-time infinite horizon problems of optimal control to a terminal set of states. These are the problems that are often taken as the starting point for adaptive dynamic programming. Under very general…
Solving the Hamilton-Jacobi-Bellman equation is important in many domains including control, robotics and economics. Especially for continuous control, solving this differential equation and its extension the Hamilton-Jacobi-Isaacs…
This paper addresses the model-free nonlinear optimal problem with generalized cost functional, and a data-based reinforcement learning technique is developed. It is known that the nonlinear optimal control problem relies on the solution of…
We introduce a new numerical method to approximate the solution of a finite horizon deterministic optimal control problem. We exploit two Hamilton-Jacobi-Bellman PDE, arising by considering the dynamics in forward and backward time. This…
We introduce a continuous policy-value iteration algorithm where the approximations of the value function of a stochastic control problem and the optimal control are simultaneously updated through Langevin-type dynamics. This framework…
In this work, we propose a class of numerical schemes for solving semilinear Hamilton-Jacobi-Bellman-Isaacs (HJBI) boundary value problems which arise naturally from exit time problems of diffusion processes with controlled drift. We…
We study policy iteration (PI) for deterministic infinite-horizon discounted optimal control problems, whose value function is characterized by a stationary Hamilton--Jacobi--Bellman (HJB) equation. At the PDE level, PI is fundamentally…
In this paper, we propose a new policy iteration algorithm to compute the value function and the optimal controls of continuous time stochastic control problems. The algorithm relies on successive approximations using linear-quadratic…
We consider infinite horizon dynamic programming problems, where the control at each stage consists of several distinct decisions, each one made by one of several agents. In an earlier work we introduced a policy iteration algorithm, where…
This paper considers optimal control of dynamical systems which are represented by nonlinear stochastic differential equations. It is well-known that the optimal control policy for this problem can be obtained as a function of a value…
We introduce a new and efficient numerical method for multicriterion optimal control and single criterion optimal control under integral constraints. The approach is based on extending the state space to include information on a "budget"…