Related papers: An Approximate Dynamic Programming Algorithm for M…
Approximate dynamic programming is a popular method for solving large Markov decision processes. This paper describes a new class of approximate dynamic programming (ADP) methods- distributionally robust ADP-that address the curse of…
In this paper, we give a new approximate dynamic programming (ADP) method to solve large-scale Markov decision programming (MDP) problem. In comparison with many classic ADP methods which have large number of constraints, we formulate an…
This paper studies a finite-horizon Markov decision problem with information-theoretic constraints, where the goal is to minimize directed information from the controlled source process to the control process, subject to stage-wise cost…
We develop a new Approximate Dynamic Programming (ADP) method for infinite horizon discounted reward Markov Decision Processes (MDP) based on projection onto a subsemimodule. We approximate the value function in terms of a $(\min,+)$ linear…
Markov Decision Processes (MDP) is an useful framework to cast optimal sequential decision making problems. Given any MDP the aim is to find the optimal action selection mechanism i.e., the optimal policy. Typically, the optimal policy…
In this paper, we propose an approximate dynamic programming (ADP) algorithm to solve a Markov decision process (MDP) formulation for the admission control of elective patients. To manage the elective patients from multiple specialties…
Approximate Dynamic Programming (ADP) is a methodology to solve multi-stage stochastic optimization problems in multi-dimensional discrete or continuous spaces. ADP approximates the optimal value function by adaptively sampling both action…
There is a growing interest in using electric vehicles (EVs) and drones for many applications. However, battery-oriented issues, including range anxiety and battery degradation, impede adoption. Battery swap stations are one alternative to…
This study is aimed at answering the famous question of how the approximation errors at each iteration of Approximate Dynamic Programming (ADP) affect the quality of the final results considering the fact that errors at each iteration…
Equipping approximate dynamic programming (ADP) with inputconstraints has a tremendous significance. This enables ADP to be applied tothe systems with actuator limitations, which is quite common for dynamicalsystems. In a conventional…
We consider a dynamic programming (DP) approach to approximately solving an infinite-horizon constrained Markov decision process (CMDP) problem with a fixed initial-state for the expected total discounted-reward criterion with a…
We consider large-scale Markov decision processes (MDPs) with a risk measure of variability in cost, under the risk-aware MDPs paradigm. Previous studies showed that risk-aware MDPs, based on a minimax approach to handling risk, can be…
We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural `projection' of a…
Reinforcement learning based adaptive/approximate dynamic programming (ADP) is a powerful technique to determine an approximate optimal controller for a dynamical system. These methods bypass the need to analytically solve the nonlinear…
Markov decision processes (MDPs) are used to model stochastic systems in many applications. Several efficient algorithms to compute optimal policies have been studied in the literature, including value iteration (VI) and policy iteration.…
In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic…
Value iteration is a well-known method of solving Markov Decision Processes (MDPs) that is simple to implement and boasts strong theoretical convergence guarantees. However, the computational cost of value iteration quickly becomes…
One of the most widely used methods for solving average cost MDP problems is the value iteration method. This method, however, is often computationally impractical and restricted in size of solvable MDP problems. We propose acceleration…
In this paper, we consider the finite-state approximation of a discrete-time constrained Markov decision process (MDP) under the discounted and average cost criteria. Using the linear programming formulation of the constrained discounted…
Recent work on approximate linear programming (ALP) techniques for first-order Markov Decision Processes (FOMDPs) represents the value function linearly w.r.t. a set of first-order basis functions and uses linear programming techniques to…