Related papers: Dynamic Programming with Recursive Preferences: Op…
New approaches to the theory of dynamic programming view dynamic programs as families of policy operators acting on partially ordered sets. In this paper, we extend these ideas by shifting from arbitrary partially ordered sets to ordered…
Some approaches to solving challenging dynamic programming problems, such as Q-learning, begin by transforming the Bellman equation into an alternative functional equation, in order to open up a new line of attack. Our paper studies this…
We consider challenging dynamic programming models where the associated Bellman equation, and the value and policy iteration algorithms commonly exhibit complex and even pathological behavior. Our analysis is based on the new notion of…
In this paper, we consider discrete-time infinite horizon problems of optimal control to a terminal set of states. These are the problems that are often taken as the starting point for adaptive dynamic programming. Under very general…
We study relationships between dynamic programs by applying conjugacy methods from dynamical systems theory. When two dynamic programs are connected by an order isomorphism, we show that optimality properties transmit from one formulation…
Recent work [Ran22] formulated a class of optimal control problems involving positive linear systems, linear stage costs, and elementwise constraints on control. It was shown that the problem admits linear optimal cost and the associated…
Bellman formulated a vague principle for optimization over time, which characterizes optimal policies by stating that a decision maker should not regret previous decisions retrospectively. This paper addresses time consistency in stochastic…
In this paper we consider a broad class of infinite horizon discrete-time optimal control models that involve a nonnegative cost function and an affine mapping in their dynamic programming equation. They include as special cases classical…
This paper deals with unconstrained discounted continuous-time Markov decision processes in Borel state and action spaces. Under some conditions imposed on the primitives, allowing unbounded transition rates and unbounded (from both above…
Markov decision problems are most commonly solved via dynamic programming. Another approach is Bellman residual minimization, which directly minimizes the squared Bellman residual objective function. However, compared to dynamic…
The principle of optimality is a fundamental aspect of dynamic programming, which states that the optimal solution to a dynamic optimization problem can be found by combining the optimal solutions to its sub-problems. While this principle…
Differential Dynamic Programming is an optimal control technique often used for trajectory generation. Many variations of this algorithm have been developed in the literature, including algorithms for stochastic dynamics or state and input…
This paper investigates discrete-time Markov decision processes with recursive utilities (or payoffs) defined by the classic CES aggregator and the Kreps-Porteus certainty equivalent operator. According to the classification introduced by…
In the theory of dynamic programming, an optimal policy is a policy whose lifetime value dominates that of all other policies from every possible initial condition in the state space. This raises a natural question: when does optimality…
Differential Dynamic Programming (DDP) has become a well established method for unconstrained trajectory optimization. Despite its several applications in robotics and controls however, a widely successful constrained version of the…
The solutions to many sequential decision-making problems are characterized by dynamic programming and Bellman's principle of optimality. However, due to the inherent complexity of solving Bellman's equation exactly, there has been…
This paper aims to explore the relationship between maximum principle and dynamic programming principle for stochastic recursive control problem with random coefficients. Under certain regular conditions for the coefficients, the…
We consider stochastic dynamic programming problems with high-dimensional, discrete state-spaces and finite, discrete-time horizons that prohibit direct computation of the value function from a given Bellman equation for all states and time…
In this paper, we study one kind of stochastic recursive optimal control problem with the obstacle constraints for the cost function where the cost function is described by the solution of one reflected backward stochastic differential…
In this paper, we present a novel maximum entropy formulation of the Differential Dynamic Programming algorithm and derive two variants using unimodal and multimodal value functions parameterizations. By combining the maximum entropy…