Related papers: Feature Markov Decision Processes

200 papers

General-purpose, intelligent, learning agents cycle through sequences of observations, actions, and rewards that are complex, uncertain, unknown, and non-Markovian. On the other hand, reinforcement learning is well-developed for small…

Machine Learning · Computer Science 2009-12-30 Marcus Hutter

A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov. However, when MDPs have rich observations, agents typically learn by way of an abstract state…

Machine Learning · Computer Science 2024-03-18 Cameron Allen, Neev Parikh, Omer Gottesman, George Konidaris

Reinforcement learning algorithms are typically designed for generic Markov Decision Processes (MDPs), where any state-action pair can lead to an arbitrary transition distribution. In many practical systems, however, only a subset of the…

Machine Learning · Computer Science 2026-03-05 Davide Maran, Davide Salaorni, Marcello Restelli

Reinforcement learning usually assumes a given or sometimes even fixed environment in which an agent seeks an optimal policy to maximize its long-term discounted reward. In contrast, we consider agents that are not limited to passive…

Machine Learning · Computer Science 2025-10-20 Ziqing Lu, Babak Hassibi, Lifeng Lai, Weiyu Xu

In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take…

Optimization and Control · Mathematics 2019-12-09 Ather Gattami

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize…

Optimization and Control · Mathematics 2015-07-07 Mahmoud El Chamie, Behcet Acikmese

While significant advancements have been made in the field of fair machine learning, the majority of studies focus on scenarios where the decision model operates on a static population. In this paper, we study fairness in dynamic systems…

Machine Learning · Computer Science 2024-01-15 Yaowei Hu, Jacob Lear, Lu Zhang

Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing over robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision…

Machine Learning · Computer Science 2025-05-26 Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt

Standard Markov decision process (MDP) and reinforcement learning algorithms optimize the policy with respect to the expected gain. We propose an algorithm which enables to optimize an alternative objective: the probability that the gain is…

Machine Learning · Computer Science 2023-03-06 Vincent Corlay, Jean-Christophe Sibel

Traditionally, Reinforcement Learning (RL) aims at deciding how to act optimally for an artificial agent. We argue that deciding when to act is equally important. As humans, we drift from default, instinctive or memorized behaviors to…

Machine Learning · Computer Science 2022-03-17 Alexis Jacq, Johan Ferret, Olivier Pietquin, Matthieu Geist

In these notes we will tackle the problem of finding optimal policies for Markov decision processes (MDPs) which are not fully known to us. Our intention is to slowly transition from an offline setting to an online (learning) setting.…

Artificial Intelligence · Computer Science 2022-06-22 Guillermo A. Perez

Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize results regarding the complexity of solving…

Artificial Intelligence · Computer Science 2013-02-21 Michael L. Littman, Thomas L. Dean, Leslie Pack Kaelbling

In supervised learning, we fit a single statistical model to a given data set, assuming that the data is associated with a singular task, which yields well-tuned models for specific use, but does not adapt well to new contexts. By contrast,…

Machine Learning · Computer Science 2020-09-11 Bingjia Wang, Alec Koppel, Vikram Krishnamurthy

Markov Decision Processes (MDPs), the mathematical framework underlying most algorithms in Reinforcement Learning (RL), are often used in a way that wrongfully assumes that the state of an agent's environment does not change during action…

Machine Learning · Computer Science 2019-12-13 Simon Ramstedt, Christopher Pal

A tenet of reinforcement learning is that the agent always observes rewards. However, this is not true in many realistic settings, e.g., a human observer may not always be available to provide rewards, sensors may be limited or…

Machine Learning · Computer Science 2026-03-24 Alireza Kazemipour, Simone Parisi, Matthew E. Taylor, Michael Bowling

Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes…

Machine Learning · Computer Science 2019-10-23 Jan Humplik, Alexandre Galashov, Leonard Hasenclever, Pedro A. Ortega, Yee Whye Teh, Nicolas Heess

The problem of selecting the right state-representation in a reinforcement learning problem is considered. Several models (functions mapping past observations to a finite set) of the observations are given, and it is known that for at least…

Machine Learning · Computer Science 2013-02-12 Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko

Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Mean payoff (or long-run average reward) provides a mathematically elegant formalism to express performance related…

Performance · Computer Science 2017-09-08 Jan Křetínský, Tobias Meggendorfer

We present a general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the…

A central task in control theory, artificial intelligence, and formal methods is to synthesize reward-maximizing strategies for agents that operate in partially unknown environments. In environments modeled by gray-box Markov decision…

Machine Learning · Computer Science 2023-04-25 Christel Baier, Clemens Dubslaff, Patrick Wienhöft, Stefan J. Kiebel