Related papers: Stochastic dynamic programming with non-linear dis…

Discounted Continuous-time Markov Decision Processes with Unbounded Rates: the Dynamic Programming Approach

This paper deals with unconstrained discounted continuous-time Markov decision processes in Borel state and action spaces. Under some conditions imposed on the primitives, allowing unbounded transition rates and unbounded (from both above…

Optimization and Control · Mathematics 2011-03-02 Alexey Piunovskiy, Yi Zhang

Markov Decision Processes with Recursive Risk Measures

In this paper, we consider risk-sensitive Markov Decision Processes (MDPs) with Borel state and action spaces and unbounded cost under both finite and infinite planning horizons. Our optimality criterion is based on the recursive…

Optimization and Control · Mathematics 2025-10-16 Nicole Bäuerle, Alexander Glauner

An axiomatic approach to Markov decision processes

This paper presents an axiomatic approach to finite Markov decision processes where the discount rate is zero. One of the principal difficulties in the no discounting case is that, even if attention is restricted to stationary policies, a…

Optimization and Control · Mathematics 2022-11-23 Adam Jonsson

On Linear Programming for Constrained and Unconstrained Average-Cost Markov Decision Processes with Countable Action Spaces and Strictly Unbounded Costs

We consider the linear programming approach for constrained and unconstrained Markov decision processes (MDPs) under the long-run average cost criterion, where the class of MDPs in our study have Borel state spaces and discrete countable…

Optimization and Control · Mathematics 2021-04-20 Huizhen Yu

Gradient-Bounded Dynamic Programming for Submodular and Concave Extensible Value Functions with Probabilistic Performance Guarantees

We consider stochastic dynamic programming problems with high-dimensional, discrete state-spaces and finite, discrete-time horizons that prohibit direct computation of the value function from a given Bellman equation for all states and time…

Optimization and Control · Mathematics 2020-06-05 Denis Lebedev, Paul Goulart, Kostas Margellos

Stochastic dynamic programming under recursive Epstein-Zin preferences

This paper investigates discrete-time Markov decision processes with recursive utilities (or payoffs) defined by the classic CES aggregator and the Kreps-Porteus certainty equivalent operator. According to the classification introduced by…

Optimization and Control · Mathematics 2025-07-11 Anna Jaśkiewicz, Andrzej S. Nowak

On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes

We consider infinite-horizon $\gamma$-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy. We consider the algorithm Value Iteration and the sequence of policies $\pi_1,...,\pi_k$ it…

Artificial Intelligence · Computer Science 2012-04-02 Bruno Scherrer

Applications of Variable Discounting Dynamic Programming to Iterated Function Systems and Related Problems

We study existence and uniqueness of the fixed-point solutions of a large class of non-linear variable discounted transfer operators associated to a sequential decision-making process. We establish regularity properties of these solutions,…

Dynamical Systems · Mathematics 2019-02-20 L. Cioletti, Elismar R. Oliveira

Policy Evaluation in Continuous MDPs with Efficient Kernelized Gradient Temporal Difference

We consider policy evaluation in infinite-horizon discounted Markov decision problems (MDPs) with infinite spaces. We reformulate this task as a compositional stochastic program with a function-valued decision variable that belongs to a…

Optimization and Control · Mathematics 2020-05-19 Alec Koppel, Garrett Warnell, Ethan Stump, Peter Stone, Alejandro Ribeiro

Constrained discounted Markov decision processes with Borel state spaces

We study discrete-time discounted constrained Markov decision processes (CMDPs) on Borel spaces with unbounded reward functions. In our approach the transition probability functions are weakly or set-wise continuous. The reward functions…

Optimization and Control · Mathematics 2019-03-29 Eugene A. Feinberg, Anna Jaśkiewicz, Andrzej S. Nowak

On risk-sensitive piecewise deterministic Markov decision processes

We consider a piecewise deterministic Markov decision process, where the expected exponential utility of total (nonnegative) cost is to be minimized. The cost rate, transition rate and post-jump distributions are under control. The state…

Optimization and Control · Mathematics 2017-11-22 Xin Guo, Yi Zhang

A Reinforcement Learning Approach to the Stochastic Cutting Stock Problem

We propose a formulation of the stochastic cutting stock problem as a discounted infinite-horizon Markov decision process. At each decision epoch, given current inventory of items, an agent chooses in which patterns to cut objects in stock…

Optimization and Control · Mathematics 2022-06-29 Anselmo R. Pitombeira-Neto, Arthur H. Fonseca Murta

Tight Performance Bounds for Approximate Modified Policy Iteration with Non-Stationary Policies

We consider approximate dynamic programming for the infinite-horizon stationary $\gamma$-discounted optimal control problem formalized by Markov Decision Processes. While in the exact case it is known that there always exists an optimal…

Optimization and Control · Mathematics 2013-04-23 Boris Lesner, Bruno Scherrer

Reinforcement Learning for Joint Optimization of Multiple Rewards

Finding optimal policies which maximize long term rewards of Markov Decision Processes requires the use of dynamic programming and backward induction to solve the Bellman optimality equation. However, many real-world problems require…

Machine Learning · Computer Science 2023-01-10 Mridul Agarwal, Vaneet Aggarwal

Generalized Dual Dynamic Programming for Infinite Horizon Problems in Continuous State and Action Spaces

We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon…

Optimization and Control · Mathematics 2018-10-05 Joseph Warrington, Paul N. Beuchat, John Lygeros

Randomized Linear Programming Solves the Discounted Markov Decision Problem In Nearly-Linear (Sometimes Sublinear) Running Time

We propose a novel randomized linear programming algorithm for approximating the optimal policy of the discounted Markov decision problem. By leveraging the value-policy duality and binary-tree data structures, the algorithm adaptively…

Optimization and Control · Mathematics 2019-06-04 Mengdi Wang

Dynamic Programming with State-Dependent Discounting

This paper extends the core results of discrete time infinite horizon dynamic programming to the case of state-dependent discounting. We obtain a condition on the discount factor process under which all of the standard optimality results…

General Economics · Economics 2020-10-15 John Stachurski, Junnan Zhang

A recursion-free functional approximation for the dynamic inventory problem

We consider the dynamic inventory problem with non-stationary demands. It has long been known that non-stationary (s, S) policies are optimal for this problem. However, finding optimal policy parameters remains a computational challenge as…

Optimization and Control · Mathematics 2020-07-20 Onur A. Kilic, S. Armagan Tarim

Stochastic resetting induces quantum non-Markovianity

Stochastic resetting describes dynamics which are reinitialized to a reference state at random times. These protocols are attracting significant interest: they can stabilize nonequilibrium stationary states, generate correlations in…

Quantum Physics · Physics 2026-01-21 Federico Carollo, Sascha Wald

Gradient-Bounded Dynamic Programming with Submodular and Concave Extensible Value Functions

We consider dynamic programming problems with finite, discrete-time horizons and prohibitively high-dimensional, discrete state-spaces for direct computation of the value function from the Bellman equation. For the case that the value…

Optimization and Control · Mathematics 2020-05-25 Denis Lebedev, Paul Goulart, Kostas Margellos