Related papers: Uniform value in Dynamic Programming
We consider a dynamic programming problem with arbitrary state space and bounded rewards. Is it possible to define in a unique way a limit value for the problem, where the "patience" of the decision-maker tends to infinity? We consider,…
We describe an approach for exploiting structure in Markov Decision Processes with continuous state variables. At each step of the dynamic programming, the state space is dynamically partitioned into regions where the value function is the…
We consider a broad class of dynamic programming (DP) problems that involve a partially linear structure and some positivity properties in their system equation and cost function. We address deterministic and stochastic problems, possibly…
We incorporate safety specifications into dynamic programming. Explicitly, we address the minimization problem of a Markov decision process up to a stopping time with safety constraints. To incorporate safety into dynamic programming, we…
We consider a dynamic programming (DP) approach to approximately solving an infinite-horizon constrained Markov decision process (CMDP) problem with a fixed initial-state for the expected total discounted-reward criterion with a…
In several standard models of dynamic programming (gambling houses, MDPs, POMDPs), we prove the existence of a very robust notion of value for the infinitely repeated problem, namely the pathwise uniform value. This solves two open…
In the theory of dynamic programming, an optimal policy is a policy whose lifetime value dominates that of all other policies from every possible initial condition in the state space. This raises a natural question: when does optimality…
We prove in a dynamic programming framework that uniform convergence of the finite horizon values implies that asymptotically the average accumulated payoff is constant on optimal trajectories. We analyze and discuss several possible…
This paper extends the core results of discrete time infinite horizon dynamic programming to the case of state-dependent discounting. We obtain a condition on the discount factor process under which all of the standard optimality results…
The principle of optimality is a fundamental aspect of dynamic programming, which states that the optimal solution to a dynamic optimization problem can be found by combining the optimal solutions to its sub-problems. While this principle…
Nonzero sum games typically have multiple Nash equilibriums (or no equilibrium), and unlike the zero sum case, they may have different values at different equilibriums. Instead of focusing on the existence of individual equilibriums, we…
We study a Dynamic Programming Principle related to the $p$-Laplacian for $1 < p < \infty$. The main results are existence, uniqueness and continuity of solutions.
We study the existence and uniqueness of fixed-point solutions of a large class of non-linear variable discounted transfer operators associated with a sequential decision-making process. We establish regularity properties of these solutions,…
This paper deals with unconstrained discounted continuous-time Markov decision processes in Borel state and action spaces. Under some conditions imposed on the primitives, allowing unbounded transition rates and unbounded (from both above…
We study the dynamic programming approach to revenue management in the context of attended home delivery. We draw on results from dynamic programming theory for Markov decision problems to show that the underlying Bellman operator has a…
We describe an abstract control-theoretic framework in which the validity of the dynamic programming principle can be established in continuous time by a verification of a small number of structural properties. As an application we treat…
This paper is concerned with two-person dynamic zero-sum games. Let the games of some family share common dynamics, running costs, and player capabilities, and let these games differ in densities only. We show that the Dynamic Programming…
We consider killed Markov decision processes for countable models on a finite time-interval. Existence of a uniform $\varepsilon$-optimal policy is proven. We show the correctness of the fundamental equation. The optimal control problem is…
We provide an alternative approach to the existence of solutions to dynamic programming equations arising in the discrete game-theoretic interpretations for various nonlinear partial differential equations including the infinity Laplacian,…
We consider dynamic programming problems with finite, discrete-time horizons and prohibitively high-dimensional, discrete state-spaces for direct computation of the value function from the Bellman equation. For the case that the value…