Related papers: Kernel Taylor-Based Value Function Approximation f…

Kernel-based diffusion approximated Markov decision processes for autonomous navigation and control on unstructured terrains

We propose a diffusion approximation method to the continuous-state Markov Decision Processes (MDPs) that can be utilized to address autonomous navigation and control in unstructured off-road environments. In contrast to most…

Robotics · Computer Science 2024-02-08 Junhong Xu , Kai Yin , Zheng Chen , Jason M. Gregory , Ethan A. Stump , Lantao Liu

Policy Evaluation in Continuous MDPs with Efficient Kernelized Gradient Temporal Difference

We consider policy evaluation in infinite-horizon discounted Markov decision problems (MDPs) with infinite spaces. We reformulate this task a compositional stochastic program with a function-valued decision variable that belongs to a…

Optimization and Control · Mathematics 2020-05-19 Alec Koppel , Garrett Warnell , Ethan Stump , Peter Stone , Alejandro Ribeiro

Partial Policy Iteration for L1-Robust Markov Decision Processes

Robust Markov decision processes (MDPs) allow to compute reliable solutions for dynamic decision problems whose evolution is modeled by rewards and partially-known transition probabilities. Unfortunately, accounting for uncertainty in the…

Machine Learning · Computer Science 2020-06-18 Chin Pang Ho , Marek Petrik , Wolfram Wiesemann

State-Continuity Approximation of Markov Decision Processes via Finite Element Methods for Autonomous System Planning

Motion planning under uncertainty for an autonomous system can be formulated as a Markov Decision Process with a continuous state space. In this paper, we propose a novel solution to this decision-theoretic planning problem that directly…

Robotics · Computer Science 2020-07-02 Junhong Xu , Kai Yin , Lantao Liu

On Reward Structures of Markov Decision Processes

A Markov decision process can be parameterized by a transition kernel and a reward function. Both play essential roles in the study of reinforcement learning as evidenced by their presence in the Bellman equations. In our inquiry of various…

Machine Learning · Computer Science 2023-09-04 Falcon Z. Dai

Approximate Kalman Filter Q-Learning for Continuous State-Space MDPs

We seek to learn an effective policy for a Markov Decision Process (MDP) with continuous states via Q-Learning. Given a set of basis functions over state action pairs we search for a corresponding set of linear weights that minimizes the…

Machine Learning · Computer Science 2013-09-27 Charles Tripp , Ross D. Shachter

Optimal policy evaluation using kernel-based temporal difference methods

We study methods based on reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP). We study a regularized form of the kernel least-squares temporal difference (LSTD)…

Machine Learning · Statistics 2021-09-27 Yaqi Duan , Mengdi Wang , Martin J. Wainwright

Efficient Planning in Large MDPs with Weak Linear Function Approximation

Large-scale Markov decision processes (MDPs) require planning algorithms with runtime independent of the number of states of the MDP. We consider the planning problem in MDPs using linear value function approximation with only weak…

Machine Learning · Computer Science 2020-07-14 Roshan Shariff , Csaba Szepesvári

An Efficient Solution to s-Rectangular Robust Markov Decision Processes

We present an efficient robust value iteration for \texttt{s}-rectangular robust Markov Decision Processes (MDPs) with a time complexity comparable to standard (non-robust) MDPs which is significantly faster than any existing method. We do…

Machine Learning · Computer Science 2023-02-01 Navdeep Kumar , Kfir Levy , Kaixin Wang , Shie Mannor

An Adaptive State Aggregation Algorithm for Markov Decision Processes

Value iteration is a well-known method of solving Markov Decision Processes (MDPs) that is simple to implement and boasts strong theoretical convergence guarantees. However, the computational cost of value iteration quickly becomes…

Machine Learning · Computer Science 2021-07-26 Guanting Chen , Johann Demetrio Gaebler , Matt Peng , Chunlin Sun , Yinyu Ye

A Primal-Dual Approach to Constrained Markov Decision Processes

In many operations management problems, we need to make decisions sequentially to minimize the cost while satisfying certain constraints. One modeling approach to study such problems is constrained Markov decision process (CMDP). When…

Optimization and Control · Mathematics 2021-01-27 Yi Chen , Jing Dong , Zhaoran Wang

Policy Iteration for Relational MDPs

Relational Markov Decision Processes are a useful abstraction for complex reinforcement learning problems and stochastic planning problems. Recent work developed representation schemes and algorithms for planning in such problems using the…

Artificial Intelligence · Computer Science 2012-06-26 Chenggang Wang , Roni Khardon

Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes

In this paper we provide faster algorithms for approximately solving discounted Markov Decision Processes in multiple parameter regimes. Given a discounted Markov Decision Process (DMDP) with $|S|$ states, $|A|$ actions, discount factor…

Data Structures and Algorithms · Computer Science 2020-12-24 Aaron Sidford , Mengdi Wang , Xian Wu , Yinyu Ye

Polynomial Value Iteration Algorithms for Detrerminstic MDPs

Value iteration is a commonly used and empirically competitive method in solving many Markov decision process problems. However, it is known that value iteration has only pseudo-polynomial complexity in general. We establish a somewhat…

Artificial Intelligence · Computer Science 2013-01-07 Omid Madani

Markov Decision Processes with Recursive Risk Measures

In this paper, we consider risk-sensitive Markov Decision Processes (MDPs) with Borel state and action spaces and unbounded cost under both finite and infinite planning horizons. Our optimality criterion is based on the recursive…

Optimization and Control · Mathematics 2025-10-16 Nicole Bäuerle , Alexander Glauner

Bayesian Risk-Sensitive Policy Optimization For MDPs With General Loss Functions

Motivated by many application problems, we consider Markov decision processes (MDPs) with a general loss function and unknown parameters. To mitigate the epistemic uncertainty associated with unknown parameters, we take a Bayesian approach…

Machine Learning · Computer Science 2025-10-02 Xiaoshuang Wang , Yifan Lin , Enlu Zhou

Rank-One Modified Value Iteration

In this paper, we provide a novel algorithm for solving planning and learning problems of Markov decision processes. The proposed algorithm follows a policy iteration-type update by using a rank-one approximation of the transition…

Optimization and Control · Mathematics 2025-10-23 Arman Sharifi Kolarijani , Tolga Ok , Peyman Mohajerin Esfahani , Mohamad Amin Sharif Kolarijani

Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes

In this work, we consider a cooperative multi-agent Markov decision process (MDP) involving m agents. At each decision epoch, all the m agents independently select actions in order to maximize a common long-term objective. In the policy…

Machine Learning · Computer Science 2024-05-01 Lakshmi Mandal , Chandrashekar Lakshminarayanan , Shalabh Bhatnagar

Dynamic Programming Through the Lens of Semismooth Newton-Type Methods (Extended Version)

Policy iteration and value iteration are at the core of many (approximate) dynamic programming methods. For Markov Decision Processes with finite state and action spaces, we show that they are instances of semismooth Newton-type methods to…

Optimization and Control · Mathematics 2022-06-28 Matilde Gargiani , Andrea Zanelli , Dominic Liao-McPherson , Tyler Summers , John Lygeros

Finite-Horizon Markov Decision Processes with Sequentially-Observed Transitions

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize…

Optimization and Control · Mathematics 2015-07-07 Mahmoud El Chamie , Behcet Acikmese