Related papers: Deceptive Sequential Decision-Making via Regulariz…

Deception in Optimal Control

In this paper, we consider an adversarial scenario where one agent seeks to achieve an objective and its adversary seeks to learn the agent's intentions and prevent the agent from achieving its objective. The agent has an incentive to try…

Optimization and Control · Mathematics 2018-05-09 Melkior Ornik, Ufuk Topcu

Reward-Based Deception with Cognitive Bias

Deception plays a key role in adversarial or strategic interactions for the purpose of self-defence and survival. This paper introduces a general framework and solution to address deception. Most existing approaches for deception consider…

Artificial Intelligence · Computer Science 2019-04-26 Bo Wu, Murat Cubuktepe, Suda Bharadwaj, Ufuk Topcu

Deception in Supervisory Control

The use of deceptive strategies is important for an agent that attempts not to reveal his intentions in an adversarial environment. We consider a setting in which a supervisor provides a reference policy and expects an agent to follow the…

Optimization and Control · Mathematics 2023-01-04 Mustafa O. Karabag, Melkior Ornik, Ufuk Topcu

Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning

Detection of malicious behavior is a fundamental problem in security. One of the major challenges in using detection systems in practice is in dealing with an overwhelming number of alerts that are triggered by normal behavior (the…

Cryptography and Security · Computer Science 2019-06-24 Liang Tong, Aron Laszka, Chao Yan, Ning Zhang, Yevgeniy Vorobeychik

Adversarial Imitation via Variational Inverse Reinforcement Learning

We consider a problem of learning the reward and policy from expert examples under unknown dynamics. Our proposed method builds on the framework of generative adversarial networks and introduces the empowerment-regularized maximum-entropy…

Machine Learning · Computer Science 2019-02-26 Ahmed H. Qureshi, Byron Boots, Michael C. Yip

Markov Decision Processes of the Third Kind: Learning Distributions by Policy Gradient Descent

The goal of this paper is to analyze distributional Markov Decision Processes as a class of control problems in which the objective is to learn policies that steer the distribution of a cumulative reward toward a prescribed target law,…

Optimization and Control · Mathematics 2026-02-09 Nicole Bäuerle, Athanasios Vasileiadis

Deceptive Reinforcement Learning for Privacy-Preserving Planning

In this paper, we study the problem of deceptive reinforcement learning to preserve the privacy of a reward function. Reinforcement learning is the problem of finding a behaviour policy based on rewards received from exploratory behaviour.…

Machine Learning · Computer Science 2021-02-08 Zhengshang Liu, Yue Yang, Tim Miller, Peta Masters

Regularizing Adversarial Imitation Learning Using Causal Invariance

Imitation learning methods are used to infer a policy in a Markov decision process from a dataset of expert demonstrations by minimizing a divergence measure between the empirical state occupancy measures of the expert and the policy. The…

Machine Learning · Computer Science 2023-08-21 Ivan Ovinnikov, Joachim M. Buhmann

Deceptive Planning for Resource Allocation

We consider a team of autonomous agents that navigate in an adversarial environment and aim to achieve a task by allocating their resources over a set of target locations. An adversary in the environment observes the autonomous team's…

Optimization and Control · Mathematics 2023-10-09 Shenghui Chen, Yagiz Savas, Mustafa O. Karabag, Brian M. Sadler, Ufuk Topcu

Learning Causally Invariant Reward Functions from Diverse Demonstrations

Inverse reinforcement learning methods aim to retrieve the reward function of a Markov decision process based on a dataset of expert demonstrations. The commonplace scarcity and heterogeneous sources of such demonstrations can lead to the…

Machine Learning · Computer Science 2024-09-13 Ivan Ovinnikov, Eugene Bykovets, Joachim M. Buhmann

Deceptive Decision-Making Under Uncertainty

We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks in stochastic, complex environments. By modeling the agent's behavior as a Markov decision process, we…

Artificial Intelligence · Computer Science 2021-09-15 Yagiz Savas, Christos K. Verginis, Ufuk Topcu

Deception Against Data-Driven Linear-Quadratic Control

Deception is a common defense mechanism against adversaries with an information disadvantage. It can force such adversaries to select suboptimal policies for a defender's benefit. We consider a setting where an adversary tries to learn the…

Systems and Control · Electrical Eng. & Systems 2026-02-20 Filippos Fotiadis, Aris Kanellopoulos, Kyriakos G. Vamvoudakis, Ufuk Topcu

Differentially Private Reward Functions in Policy Synthesis for Markov Decision Processes

Markov decision processes often seek to maximize a reward function, but onlookers may infer reward functions by observing the states and actions of such systems, revealing sensitive information. Therefore, in this paper we introduce and…

Systems and Control · Electrical Eng. & Systems 2024-09-04 Alexander Benvenuti, Calvin Hawkins, Brandon Fallin, Bo Chen, Brendan Bialy, Miriam Dennis, Matthew Hale

Preserving the Privacy of Reward Functions in MDPs through Deception

Preserving the privacy of preferences (or rewards) of a sequential decision-making agent when decisions are observable is crucial in many physical and cybersecurity domains. For instance, in wildlife monitoring, agents must allocate…

Artificial Intelligence · Computer Science 2024-07-16 Shashank Reddy Chirra, Pradeep Varakantham, Praveen Paruchuri

Reinforcement Learning of Sequential Price Mechanisms

We introduce the use of reinforcement learning for indirect mechanisms, working with the existing class of sequential price mechanisms, which generalizes both serial dictatorship and posted price mechanisms and essentially characterizes all…

Computer Science and Game Theory · Computer Science 2021-05-07 Gianluca Brero, Alon Eden, Matthias Gerstgrasser, David C. Parkes, Duncan Rheingans-Yoo

VISTA: Decentralized Machine Learning in Adversary Dominated Environments

Decentralized machine learning often relies on outsourcing computations, such as gradient evaluations, to untrusted worker nodes. Existing robust aggregation methods can mitigate malicious behavior under honest-majority assumptions, but may…

Machine Learning · Computer Science 2026-05-11 Hanzaleh Akbari Nodehi, Parsa Moradi, Soheil Mohajer, Mohammad Ali Maddah-Ali

Understanding Adversarial Attacks on Observations in Deep Reinforcement Learning

Deep reinforcement learning models are vulnerable to adversarial attacks that can decrease a victim's cumulative expected reward by manipulating the victim's observations. Despite the efficiency of previous optimization-based methods for…

Machine Learning · Computer Science 2023-02-28 You Qiaoben, Chengyang Ying, Xinning Zhou, Hang Su, Jun Zhu, Bo Zhang

Towards an Understanding of Default Policies in Multitask Policy Optimization

Much of the recent success of deep reinforcement learning has been driven by regularized policy optimization (RPO) algorithms with strong performance across multiple domains. In this family of methods, agents are trained to maximize…

Machine Learning · Computer Science 2022-03-24 Ted Moskovitz, Michael Arbel, Jack Parker-Holder, Aldo Pacchiano

A unified view of entropy-regularized Markov decision processes

We propose a general framework for entropy-regularized average-reward reinforcement learning in Markov decision processes (MDPs). Our approach is based on extending the linear-programming formulation of policy optimization in MDPs to…

Machine Learning · Computer Science 2017-05-23 Gergely Neu, Anders Jonsson, Vicenç Gómez

A Theory of Regularized Markov Decision Processes

Many recent successful (deep) reinforcement learning algorithms make use of regularization, generally based on entropy or Kullback-Leibler divergence. We propose a general theory of regularized Markov Decision Processes that generalizes…

Machine Learning · Computer Science 2019-06-05 Matthieu Geist, Bruno Scherrer, Olivier Pietquin