Related papers: Learning and Solving Regular Decision Processes

Online Learning of Non-Markovian Reward Models

There are situations in which an agent should receive rewards only after having accomplished a series of previous tasks, that is, rewards are non-Markovian. One natural and quite general way to represent history-dependent rewards is via a…

Artificial Intelligence · Computer Science 2020-10-01 Gavin Rens, Jean-François Raskin, Raphaël Reynouad, Giuseppe Marra

Learning Non-Markovian Reward Models in MDPs

There are situations in which an agent should receive rewards only after having accomplished a series of previous tasks. In other words, the reward that the agent receives is non-Markovian. One natural and quite general way to represent…

Artificial Intelligence · Computer Science 2020-01-28 Gavin Rens, Jean-François Raskin

Reinforcement Learning with Non-Markovian Rewards

The standard RL world model is that of a Markov Decision Process (MDP). A basic premise of MDPs is that the rewards depend on the last state and action only. Yet, many real-world rewards are non-Markovian. For example, a reward for bringing…

Artificial Intelligence · Computer Science 2019-12-06 Maor Gaon, Ronen I. Brafman

Twice regularized MDPs and the equivalence between robustness and regularization

Robust Markov decision processes (MDPs) aim to handle changing or partially known system dynamics. To solve them, one typically resorts to robust optimization methods. However, this significantly increases computational complexity and…

Machine Learning · Computer Science 2021-10-14 Esther Derman, Matthieu Geist, Shie Mannor

Robust Markov Decision Processes: A Place Where AI and Formal Methods Meet

Markov decision processes (MDPs) are a standard model for sequential decision-making problems and are widely used across many scientific areas, including formal methods and artificial intelligence (AI). MDPs do, however, come with the…

Artificial Intelligence · Computer Science 2024-12-11 Marnix Suilen, Thom Badings, Eline M. Bovy, David Parker, Nils Jansen

Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization

Robust Markov decision processes (MDPs) aim to handle changing or partially known system dynamics. To solve them, one typically resorts to robust optimization methods. However, this significantly increases computational complexity and…

Machine Learning · Computer Science 2023-03-14 Esther Derman, Yevgeniy Men, Matthieu Geist, Shie Mannor

Omega-Regular Decision Processes

Regular decision processes (RDPs) are a subclass of non-Markovian decision processes where the transition and reward functions are guarded by some regular property of the past (a lookback). While RDPs enable intuitive and succinct…

Logic in Computer Science · Computer Science 2023-12-15 Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak

Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning

Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing over robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision…

Machine Learning · Computer Science 2025-05-26 Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt

Solving Robust Markov Decision Processes: Generic, Reliable, Efficient

Markov decision processes (MDP) are a well-established model for sequential decision-making in the presence of probabilities. In robust MDP (RMDP), every action is associated with an uncertainty set of probability distributions, modelling…

Artificial Intelligence · Computer Science 2024-12-16 Tobias Meggendorfer, Maximilian Weininger, Patrick Wienhöft

Decision-Theoretic Planning with non-Markovian Rewards

A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic planning, where many desirable behaviours are more…

Artificial Intelligence · Computer Science 2011-09-13 C. Gretton, F. Kabanza, D. Price, J. Slaney, S. Thiebaux

Solving Long-run Average Reward Robust MDPs via Stochastic Games

Markov decision processes (MDPs) provide a standard framework for sequential decision making under uncertainty. However, MDPs do not take uncertainty in transition probabilities into account. Robust Markov decision processes (RMDPs) address…

Artificial Intelligence · Computer Science 2024-05-01 Krishnendu Chatterjee, Ehsan Kafshdar Goharshady, Mehrdad Karrabi, Petr Novotný, Đorđe Žikelić

Sequential Decision-Making under Uncertainty: A Robust MDPs review

Fueled by advances in both robust optimization theory and reinforcement learning (RL), robust Markov Decision Processes (RMDPs) have garnered increasing attention due to their powerful capability for sequential decision-making under…

Optimization and Control · Mathematics 2025-07-08 Wenfan Ou, Sheng Bi

Efficient Policy Iteration for Robust Markov Decision Processes via Regularization

Robust Markov decision processes (MDPs) provide a general framework to model decision problems where the system dynamics are changing or only partially known. Efficient methods for some sa-rectangular robust MDPs exist, using its…

Artificial Intelligence · Computer Science 2022-10-06 Navdeep Kumar, Kfir Levy, Kaixin Wang, Shie Mannor

Robust Anytime Learning of Markov Decision Processes

Markov decision processes (MDPs) are formal models commonly used in sequential decision-making. MDPs capture the stochasticity that may arise, for instance, from imprecise actuators via probabilities in the transition function. However, in…

Artificial Intelligence · Computer Science 2023-06-21 Marnix Suilen, Thiago D. Simão, David Parker, Nils Jansen

Solving Non-Rectangular Reward-Robust MDPs via Frequency Regularization

In robust Markov decision processes (RMDPs), it is assumed that the reward and the transition dynamics lie in a given uncertainty set. By targeting maximal return under the most adversarial model from that set, RMDPs address performance…

Machine Learning · Computer Science 2024-02-13 Uri Gadot, Esther Derman, Navdeep Kumar, Maxence Mohamed Elfatihi, Kfir Levy, Shie Mannor

Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes

Average-reward Markov decision processes (MDPs) provide a foundational framework for sequential decision-making under uncertainty. However, average-reward MDPs have remained largely unexplored in reinforcement learning (RL) settings, with…

Machine Learning · Computer Science 2025-08-29 Juan Sebastian Rojas, Chi-Guhn Lee

A Structure-aware Online Learning Algorithm for Markov Decision Processes

To overcome the curse of dimensionality and curse of modeling in Dynamic Programming (DP) methods for solving classical Markov Decision Process (MDP) problems, Reinforcement Learning (RL) algorithms are popular. In this paper, we consider…

Machine Learning · Computer Science 2018-11-29 Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

Tractable Offline Learning of Regular Decision Processes

This work studies offline Reinforcement Learning (RL) in a class of non-Markovian environments called Regular Decision Processes (RDPs). In RDPs, the unknown dependency of future observations and rewards from the past interactions can be…

Machine Learning · Computer Science 2024-09-05 Ahana Deb, Roberto Cipollone, Anders Jonsson, Alessandro Ronca, Mohammad Sadegh Talebi

Implementation and Comparison of Solution Methods for Decision Processes with Non-Markovian Rewards

This paper examines a number of solution methods for decision processes with non-Markovian rewards (NMRDPs). They all exploit a temporal logic specification of the reward function to automatically translate the NMRDP into an equivalent…

Artificial Intelligence · Computer Science 2012-12-12 Charles Gretton, David Price, Sylvie Thiebaux

Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes

To overcome the curses of dimensionality and modeling of Dynamic Programming (DP) methods to solve Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice. Contrary to traditional RL algorithms…

Machine Learning · Computer Science 2021-08-24 Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar