Related papers: Feature Markov Decision Processes

Feature Reinforcement Learning: Part I: Unstructured MDPs

General-purpose, intelligent, learning agents cycle through sequences of observations, actions, and rewards that are complex, uncertain, unknown, and non-Markovian. On the other hand, reinforcement learning is well-developed for small…

Machine Learning · Computer Science 2009-12-30 Marcus Hutter

Learning Markov State Abstractions for Deep Reinforcement Learning

A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov. However, when MDPs have rich observations, agents typically learn by way of an abstract state…

Machine Learning · Computer Science 2024-03-18 Cameron Allen, Neev Parikh, Omer Gottesman, George Konidaris

Learning in Markov Decision Processes with Exogenous Dynamics

Reinforcement learning algorithms are typically designed for generic Markov Decision Processes (MDPs), where any state-action pair can lead to an arbitrary transition distribution. In many practical systems, however, only a subset of the…

Machine Learning · Computer Science 2026-03-05 Davide Maran, Davide Salaorni, Marcello Restelli

Learn to Change the World: Multi-level Reinforcement Learning with Model-Changing Actions

Reinforcement learning usually assumes a given or sometimes even fixed environment in which an agent seeks an optimal policy to maximize its long-term discounted reward. In contrast, we consider agents that are not limited to passive…

Machine Learning · Computer Science 2025-10-20 Ziqing Lu, Babak Hassibi, Lifeng Lai, Weiyu Xu

Reinforcement Learning of Markov Decision Processes with Peak Constraints

In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take…

Optimization and Control · Mathematics 2019-12-09 Ather Gattami

Finite-Horizon Markov Decision Processes with Sequentially-Observed Transitions

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize…

Optimization and Control · Mathematics 2015-07-07 Mahmoud El Chamie, Behcet Acikmese

Striking a Balance in Fairness for Dynamic Systems Through Reinforcement Learning

While significant advancements have been made in the field of fair machine learning, the majority of studies focus on scenarios where the decision model operates on a static population. In this paper, we study fairness in dynamic systems…

Machine Learning · Computer Science 2024-01-15 Yaowei Hu, Jacob Lear, Lu Zhang

Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning

Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing over robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision…

Machine Learning · Computer Science 2025-05-26 Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt

Minimizing the Outage Probability in a Markov Decision Process

Standard Markov decision process (MDP) and reinforcement learning algorithms optimize the policy with respect to the expected gain. We propose an algorithm that enables optimizing an alternative objective: the probability that the gain is…

Machine Learning · Computer Science 2023-03-06 Vincent Corlay, Jean-Christophe Sibel

Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act

Traditionally, Reinforcement Learning (RL) aims at deciding how to act optimally for an artificial agent. We argue that deciding when to act is equally important. As humans, we drift from default, instinctive or memorized behaviors to…

Machine Learning · Computer Science 2022-03-17 Alexis Jacq, Johan Ferret, Olivier Pietquin, Matthieu Geist

Lecture Notes on Partially Known MDPs

In these notes we will tackle the problem of finding optimal policies for Markov decision processes (MDPs) which are not fully known to us. Our intention is to slowly transition from an offline setting to an online (learning) setting.…

Artificial Intelligence · Computer Science 2022-06-22 Guillermo A. Perez

On the Complexity of Solving Markov Decision Problems

Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize results regarding the complexity of solving…

Artificial Intelligence · Computer Science 2013-02-21 Michael L. Littman, Thomas L. Dean, Leslie Pack Kaelbling

A Markov Decision Process Approach to Active Meta Learning

In supervised learning, we fit a single statistical model to a given data set, assuming that the data is associated with a singular task, which yields well-tuned models for specific use, but does not adapt well to new contexts. By contrast,…

Machine Learning · Computer Science 2020-09-11 Bingjia Wang, Alec Koppel, Vikram Krishnamurthy

Real-Time Reinforcement Learning

Markov Decision Processes (MDPs), the mathematical framework underlying most algorithms in Reinforcement Learning (RL), are often used in a way that wrongfully assumes that the state of an agent's environment does not change during action…

Machine Learning · Computer Science 2019-12-13 Simon Ramstedt, Christopher Pal

Model-Based Exploration in Monitored Markov Decision Processes

A tenet of reinforcement learning is that the agent always observes rewards. However, this is not true in many realistic settings, e.g., a human observer may not always be available to provide rewards, sensors may be limited or…

Machine Learning · Computer Science 2026-03-24 Alireza Kazemipour, Simone Parisi, Matthew E. Taylor, Michael Bowling

Meta reinforcement learning as task inference

Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes…

Machine Learning · Computer Science 2019-10-23 Jan Humplik, Alexandre Galashov, Leonard Hasenclever, Pedro A. Ortega, Yee Whye Teh, Nicolas Heess

Selecting the State-Representation in Reinforcement Learning

The problem of selecting the right state-representation in a reinforcement learning problem is considered. Several models (functions mapping past observations to a finite set) of the observations are given, and it is known that for at least…

Machine Learning · Computer Science 2013-02-12 Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko

Efficient Strategy Iteration for Mean Payoff in Markov Decision Processes

Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Mean payoff (or long-run average reward) provides a mathematically elegant formalism to express performance related…

Performance · Computer Science 2017-09-08 Jan Křetínský, Tobias Meggendorfer

Verification of Markov Decision Processes using Learning Algorithms

We present a general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the…

Logic in Computer Science · Computer Science 2015-03-31 Tomáš Brázdil, Krishnendu Chatterjee, Martin Chmelík, Vojtěch Forejt, Jan Křetínský, Marta Kwiatkowska, David Parker, Mateusz Ujma

Strategy Synthesis in Markov Decision Processes Under Limited Sampling Access

A central task in control theory, artificial intelligence, and formal methods is to synthesize reward-maximizing strategies for agents that operate in partially unknown environments. In environments modeled by gray-box Markov decision…

Machine Learning · Computer Science 2023-04-25 Christel Baier, Clemens Dubslaff, Patrick Wienhöft, Stefan J. Kiebel