Related papers: A Dynamic Programming Approach to Evaluating Multi…
We consider large-scale Markov decision processes (MDPs) with a risk measure of variability in cost, under the risk-aware MDPs paradigm. Previous studies showed that risk-aware MDPs, based on a minimax approach to handling risk, can be…
Approximate dynamic programming is a popular method for solving large Markov decision processes. This paper describes a new class of approximate dynamic programming (ADP) methods- distributionally robust ADP-that address the curse of…
We consider a broad class of dynamic programming (DP) problems that involve a partially linear structure and some positivity properties in their system equation and cost function. We address deterministic and stochastic problems, possibly…
We propose a new method for optimistic planning in infinite-horizon discounted Markov decision processes based on the idea of adding regularization to the updates of an otherwise standard approximate value iteration procedure. This…
We consider a dynamic programming (DP) approach to approximately solving an infinite-horizon constrained Markov decision process (CMDP) problem with a fixed initial-state for the expected total discounted-reward criterion with a…
We describe an approximate dynamic programming approach to compute lower bounds on the optimal value function for a discrete time, continuous space, infinite horizon setting. The approach iteratively constructs a family of lower bounding…
We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural `projection' of a…
In this paper, We propose a general Riemannian proximal optimization algorithm with guaranteed convergence to solve Markov decision process (MDP) problems. To model policy functions in MDP, we employ Gaussian mixture model (GMM) and…
Markov Decision Processes (MDP) is an useful framework to cast optimal sequential decision making problems. Given any MDP the aim is to find the optimal action selection mechanism i.e., the optimal policy. Typically, the optimal policy…
Many sequential decision problems can be formulated as Markov Decision Processes (MDPs) where the optimal value function (or cost-to-go function) can be shown to satisfy a monotone structure in some or all of its dimensions. When the state…
We consider the problem of estimating a probability distribution that maximizes the entropy while satisfying a finite number of moment constraints, possibly corrupted by noise. Based on duality of convex programming, we present a novel…
We consider large-scale Markov decision processes (MDPs) with parameter uncertainty, under the robust MDP paradigm. Previous studies showed that robust MDPs, based on a minimax approach to handle uncertainty, can be solved using dynamic…
In this paper, we develop approximate dynamic programming methods for stochastic systems modeled as Markov Decision Processes, given both soft performance criteria and hard constraints in a class of probabilistic temporal logic called…
We propose policy gradient algorithms for robust infinite-horizon Markov decision processes (MDPs) with non-rectangular uncertainty sets, thereby addressing an open challenge in the robust MDP literature. Indeed, uncertainty sets that…
Approximate Newton methods are a standard optimization tool which aim to maintain the benefits of Newton's method, such as a fast rate of convergence, whilst alleviating its drawbacks, such as computationally expensive calculation or…
In this paper we provide faster algorithms for approximately solving discounted Markov Decision Processes in multiple parameter regimes. Given a discounted Markov Decision Process (DMDP) with $|S|$ states, $|A|$ actions, discount factor…
In this work, we study dynamic programming (DP) algorithms for partially observable Markov decision processes with jointly continuous and discrete state-spaces. We consider a class of stochastic systems which have coupled discrete and…
In this paper, we consider the finite-state approximation of a discrete-time constrained Markov decision process (MDP) under the discounted and average cost criteria. Using the linear programming formulation of the constrained discounted…
We consider the problem of learning the optimal policy for infinite-horizon Markov decision processes (MDPs). For this purpose, some variant of Stochastic Mirror Descent is proposed for convex programming problems with Lipschitz-continuous…
In this paper, we give a new approximate dynamic programming (ADP) method to solve large-scale Markov decision programming (MDP) problem. In comparison with many classic ADP methods which have large number of constraints, we formulate an…