Related papers: Incentive Decision Processes

Incentive Design for Temporal Logic Objectives

We study the problem of designing an optimal sequence of incentives that a principal should offer to an agent so that the agent's optimal behavior under the incentives realizes the principal's objective expressed as a temporal logic…

Optimization and Control · Mathematics 2019-03-20 Yagiz Savas, Vijay Gupta, Melkior Ornik, Lillian J. Ratliff, Ufuk Topcu

Principal-Agent Reward Shaping in MDPs

Principal-agent problems arise when one party acts on behalf of another, leading to conflicts of interest. The economic literature has extensively studied principal-agent problems, and recent work has extended this to more complex scenarios…

Artificial Intelligence · Computer Science 2024-01-02 Omer Ben-Porat, Yishay Mansour, Michal Moshkovitz, Boaz Taitler

Value-Function Approximations for Partially Observable Markov Decision Processes

Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in stochastic domains in which states of the system are observable only indirectly, via a…

Artificial Intelligence · Computer Science 2011-06-02 M. Hauskrecht

Contracting With a Reinforcement Learning Agent by Playing Trick or Treat

We study principal-agent problems where a farsighted agent takes costly actions in an MDP. The core challenge in these settings is that the agent's actions are hidden from the principal, who can only observe their outcomes, namely state…

Computer Science and Game Theory · Computer Science 2024-10-18 Matteo Bollini, Francesco Bacchiocchi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes

In this work, we consider a cooperative multi-agent Markov decision process (MDP) involving m agents. At each decision epoch, all the m agents independently select actions in order to maximize a common long-term objective. In the policy…

Machine Learning · Computer Science 2024-05-01 Lakshmi Mandal, Chandrashekar Lakshminarayanan, Shalabh Bhatnagar

Extracting Incentives from Black-Box Decisions

An algorithmic decision-maker incentivizes people to act in certain ways to receive better decisions. These incentives can dramatically influence subjects' behaviors and lives, and it is important that both decision-makers and…

Machine Learning · Computer Science 2019-10-15 Yonadav Shavit, William S. Moses

On the Complexity of Sequential Incentive Design

In many scenarios, a principal dynamically interacts with an agent and offers a sequence of incentives to align the agent's behavior with a desired objective. This paper focuses on the problem of synthesizing an incentive sequence that,…

Optimization and Control · Mathematics 2020-07-20 Yagiz Savas, Vijay Gupta, Ufuk Topcu

A Contracting Dynamical System Perspective toward Interval Markov Decision Processes

Interval Markov decision processes are a class of Markov models where the transition probabilities between the states belong to intervals. In this paper, we study the problem of efficient estimation of the optimal policies in Interval…

Systems and Control · Electrical Eng. & Systems 2023-09-19 Saber Jafarpour, Samuel Coogan

Efficient Strategy Iteration for Mean Payoff in Markov Decision Processes

Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Mean payoff (or long-run average reward) provides a mathematically elegant formalism to express performance related…

Performance · Computer Science 2017-09-08 Jan Křetínský, Tobias Meggendorfer

Learning from Humans as an I-POMDP

The interactive partially observable Markov decision process (I-POMDP) is a recently developed framework which extends the POMDP to the multi-agent setting by including agent models in the state space. This paper argues for formulating the…

Robotics · Computer Science 2012-04-03 Mark P. Woodward, Robert J. Wood

Reinforcement Learning of Markov Decision Processes with Peak Constraints

In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take…

Optimization and Control · Mathematics 2019-12-09 Ather Gattami

Constrained Active Classification Using Partially Observable Markov Decision Processes

In this work, we study the problem of actively classifying the attributes of dynamical systems characterized as a finite set of Markov decision process (MDP) models. We are interested in finding strategies that actively interact with the…

Systems and Control · Electrical Eng. & Systems 2023-01-06 Bo Wu, Niklas Lauffer, Mohamadreza Ahmadi, Suda Bharadwaj, Zhe Xu, Ufuk Topcu

Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes

Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding…

Artificial Intelligence · Computer Science 2011-06-02 N. L. Zhang, W. Zhang

Probabilistic inverse reinforcement learning in unknown environments

We consider the problem of learning by demonstration from agents acting in unknown stochastic Markov environments or games. Our aim is to estimate agent preferences in order to construct improved policies for the same task that the agents…

Machine Learning · Computer Science 2014-08-12 Aristide Tossou, Christos Dimitrakakis

Probabilistic inverse reinforcement learning in unknown environments

We consider the problem of learning by demonstration from agents acting in unknown stochastic Markov environments or games. Our aim is to estimate agent preferences in order to construct improved policies for the same task that the agents…

Machine Learning · Statistics 2013-07-16 Aristide C. Y. Tossou, Christos Dimitrakakis

Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

Constrained decision-making is essential for designing safe policies in real-world control systems, yet simulated environments often fail to capture real-world adversities. We consider the problem of learning a policy that will maximize the…

Machine Learning · Computer Science 2026-02-10 Sourav Ganguly, Kishan Panaganti, Arnob Ghosh, Adam Wierman

Online Markov decision processes with policy iteration

The online Markov decision process (MDP) is a generalization of the classical Markov decision process that incorporates changing reward functions. In this paper, we propose practical online MDP algorithms with policy iteration and…

Machine Learning · Computer Science 2015-10-16 Yao Ma, Hao Zhang, Masashi Sugiyama

Minimizing the Outage Probability in a Markov Decision Process

Standard Markov decision process (MDP) and reinforcement learning algorithms optimize the policy with respect to the expected gain. We propose an algorithm that makes it possible to optimize an alternative objective: the probability that the gain is…

Machine Learning · Computer Science 2023-03-06 Vincent Corlay, Jean-Christophe Sibel

What should be observed for optimal reward in POMDPs?

Partially observable Markov Decision Processes (POMDPs) are a standard model for agents making decisions in uncertain environments. Most work on POMDPs focuses on synthesizing strategies based on the available capabilities. However, system…

Artificial Intelligence · Computer Science 2024-07-12 Alyzia-Maria Konsta, Alberto Lluch Lafuente, Christoph Matheja

Feature Markov Decision Processes

General purpose intelligent learning agents cycle through (complex, non-MDP) sequences of observations, actions, and rewards. On the other hand, reinforcement learning is well-developed for small finite state Markov Decision Processes…

Artificial Intelligence · Computer Science 2009-12-30 Marcus Hutter