Related papers: Regular Policies in Abstract Dynamic Programming
In this paper, we consider discrete-time infinite horizon problems of optimal control to a terminal set of states. These are the problems that are often taken as the starting point for adaptive dynamic programming. Under very general…
In this paper we consider a broad class of infinite horizon discrete-time optimal control models that involve a nonnegative cost function and an affine mapping in their dynamic programming equation. They include as special cases classical…
Policy iteration and value iteration are at the core of many (approximate) dynamic programming methods. For Markov Decision Processes with finite state and action spaces, we show that they are instances of semismooth Newton-type methods to…
When sales of a product are affected by randomness in demand, retailers can use dynamic pricing strategies to maximise their profits. In this article the pricing problem is formulated as a stochastic optimal control problem, where the…
Recent work [Ran22] formulated a class of optimal control problems involving positive linear systems, linear stage costs, and elementwise constraints on control. It was shown that the problem admits linear optimal cost and the associated…
We consider stochastic shortest path problems with infinite state and control spaces, a nonnegative cost per stage, and a termination state. We extend the notion of a proper policy, a policy that terminates within a finite expected number…
We consider how to use the Bellman residual of the dynamic programming operator to compute suboptimality bounds for solutions to stochastic shortest path problems. Such bounds have been previously established only in the special case that…
We consider infinite horizon dynamic programming problems, where the control at each stage consists of several distinct decisions, each one made by one of several agents. In an earlier work we introduced a policy iteration algorithm, where…
We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon…
Adaptive optimal control of nonlinear dynamic systems with deterministic and known dynamics under a known undiscounted infinite-horizon cost function is investigated. Policy iteration scheme initiated using a stabilizing initial control is…
We present an accelerated algorithm for the solution of static Hamilton-Jacobi-Bellman equations related to optimal control problems. Our scheme is based on a classic policy iteration procedure, which is known to have superlinear…
We consider discrete-time infinite horizon deterministic optimal control problems with nonnegative cost per stage, and a destination that is cost-free and absorbing. The classical linear-quadratic regulator problem is a special case. Our…
In this paper, we propose a new policy iteration algorithm to compute the value function and the optimal controls of continuous time stochastic control problems. The algorithm relies on successive approximations using linear-quadratic…
This paper provides new conditions for dynamic optimality in discrete time and uses them to establish fundamental dynamic programming results for several commonly used recursive preference specifications. These include Epstein-Zin…
We consider a broad class of dynamic programming (DP) problems that involve a partially linear structure and some positivity properties in their system equation and cost function. We address deterministic and stochastic problems, possibly…
This paper studies stochastic optimization problems and associated Bellman equations in formats that allow for reduced dimensionality of the cost-to-go functions. In particular, we study stochastic control problems in the…
Relational Markov Decision Processes are a useful abstraction for complex reinforcement learning problems and stochastic planning problems. Recent work developed representation schemes and algorithms for planning in such problems using the…
New approaches to the theory of dynamic programming view dynamic programs as families of policy operators acting on partially ordered sets. In this paper, we extend these ideas by shifting from arbitrary partially ordered sets to ordered…
Entropy regularized algorithms such as Soft Q-learning and Soft Actor-Critic, recently showed state-of-the-art performance on a number of challenging reinforcement learning (RL) tasks. The regularized formulation modifies the standard RL…
Recent progress in randomized motion planners has led to the development of a new class of sampling-based algorithms that provide asymptotic optimality guarantees, notably the RRT* and the PRM* algorithms. Careful analysis reveals that the…