Related papers: Stochastic dynamic programming with non-linear dis…
This paper deals with unconstrained discounted continuous-time Markov decision processes in Borel state and action spaces. Under some conditions imposed on the primitives, allowing unbounded transition rates and unbounded (from both above…
In this paper, we consider risk-sensitive Markov Decision Processes (MDPs) with Borel state and action spaces and unbounded cost under both finite and infinite planning horizons. Our optimality criterion is based on the recursive…
This paper presents an axiomatic approach to finite Markov decision processes where the discount rate is zero. One of the principal difficulties in the no discounting case is that, even if attention is restricted to stationary policies, a…
We consider the linear programming approach for constrained and unconstrained Markov decision processes (MDPs) under the long-run average cost criterion, where the class of MDPs in our study have Borel state spaces and discrete countable…
We consider stochastic dynamic programming problems with high-dimensional, discrete state-spaces and finite, discrete-time horizons that prohibit direct computation of the value function from a given Bellman equation for all states and time…
This paper investigates discrete-time Markov decision processes with recursive utilities (or payoffs) defined by the classic CES aggregator and the Kreps-Porteus certainty equivalent operator. According to the classification introduced by…
We consider infinite-horizon $\gamma$-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy. We consider the algorithm Value Iteration and the sequence of policies $\pi_1,...,\pi_k$ it…
We study existence and uniqueness of the fixed points solutions of a large class of non-linear variable discounted transfer operators associated to a sequential decision-making process. We establish regularity properties of these solutions,…
We consider policy evaluation in infinite-horizon discounted Markov decision problems (MDPs) with infinite spaces. We reformulate this task a compositional stochastic program with a function-valued decision variable that belongs to a…
We study discrete-time discounted constrained Markov decision processes (CMDPs) on Borel spaces with unbounded reward functions. In our approach the transition probability functions are weakly or set-wise continuous. The reward functions…
We consider a piecewise deterministic Markov decision process, where the expected exponential utility of total (nonnegative) cost is to be minimized. The cost rate, transition rate and post-jump distributions are under control. The state…
We propose a formulation of the stochastic cutting stock problem as a discounted infinite-horizon Markov decision process. At each decision epoch, given current inventory of items, an agent chooses in which patterns to cut objects in stock…
We consider approximate dynamic programming for the infinite-horizon stationary $\gamma$-discounted optimal control problem formalized by Markov Decision Processes. While in the exact case it is known that there always exists an optimal…
Finding optimal policies which maximize long term rewards of Markov Decision Processes requires the use of dynamic programming and backward induction to solve the Bellman optimality equation. However, many real-world problems require…
We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon…
We propose a novel randomized linear programming algorithm for approximating the optimal policy of the discounted Markov decision problem. By leveraging the value-policy duality and binary-tree data structures, the algorithm adaptively…
This paper extends the core results of discrete time infinite horizon dynamic programming to the case of state-dependent discounting. We obtain a condition on the discount factor process under which all of the standard optimality results…
We consider the dynamic inventory problem with non-stationary demands. It has long been known that non-stationary (s, S) policies are optimal for this problem. However, finding optimal policy parameters remains a computational challenge as…
Stochastic resetting describes dynamics which are reinitialized to a reference state at random times. These protocols are attracting significant interest: they can stabilize nonequilibrium stationary states, generate correlations in…
We consider dynamic programming problems with finite, discrete-time horizons and prohibitively high-dimensional, discrete state-spaces for direct computation of the value function from the Bellman equation. For the case that the value…