Related papers: Opponent Aware Reinforcement Learning

Reinforcement Learning under Threats

In several reinforcement learning (RL) scenarios, mainly in security settings, there may be adversaries trying to interfere with the reward generating process. In this paper, we introduce Threatened Markov Decision Processes (TMDPs), which…

Machine Learning · Computer Science 2019-10-28 Victor Gallego , Roi Naveiro , David Rios Insua

Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes

To overcome the curses of dimensionality and modeling of Dynamic Programming (DP) methods to solve Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice. Contrary to traditional RL algorithms…

Machine Learning · Computer Science 2021-08-24 Arghyadip Roy , Vivek Borkar , Abhay Karandikar , Prasanna Chaporkar

Efficient Policy Learning for Non-Stationary MDPs under Adversarial Manipulation

A Markov Decision Process (MDP) is a popular model for reinforcement learning. However, its commonly used assumption of stationary dynamics and rewards is too stringent and fails to hold in adversarial, nonstationary, or multi-agent…

Machine Learning · Computer Science 2019-08-22 Tiancheng Yu , Suvrit Sra

Meta reinforcement learning as task inference

Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes…

Machine Learning · Computer Science 2019-10-23 Jan Humplik , Alexandre Galashov , Leonard Hasenclever , Pedro A. Ortega , Yee Whye Teh , Nicolas Heess

Camouflage Adversarial Attacks on Multiple Agent Systems

The multi-agent reinforcement learning systems (MARL) based on the Markov decision process (MDP) have emerged in many critical applications. To improve the robustness/defense of MARL systems against adversarial attacks, the study of various…

Multiagent Systems · Computer Science 2024-02-01 Ziqing Lu , Guanlin Liu , Lifeng Lai , Weiyu Xu

Learn to Change the World: Multi-level Reinforcement Learning with Model-Changing Actions

Reinforcement learning usually assumes a given or sometimes even fixed environment in which an agent seeks an optimal policy to maximize its long-term discounted reward. In contrast, we consider agents that are not limited to passive…

Machine Learning · Computer Science 2025-10-20 Ziqing Lu , Babak Hassibi , Lifeng Lai , Weiyu Xu

Real-Time Reinforcement Learning

Markov Decision Processes (MDPs), the mathematical framework underlying most algorithms in Reinforcement Learning (RL), are often used in a way that wrongfully assumes that the state of an agent's environment does not change during action…

Machine Learning · Computer Science 2019-12-13 Simon Ramstedt , Christopher Pal

Reinforcement Learning with Partially Known World Dynamics

Reinforcement learning would enjoy better success on real-world problems if domain knowledge could be imparted to the algorithm by the modelers. Most problems have both hidden state and unknown dynamics. Partially observable Markov decision…

Machine Learning · Computer Science 2013-01-07 Christian R. Shelton

Provably Invincible Adversarial Attacks on Reinforcement Learning Systems: A Rate-Distortion Information-Theoretic Approach

Reinforcement learning (RL) for the Markov Decision Process (MDP) has emerged in many security-related applications, such as autonomous driving, financial decisions, and drone/robot algorithms. In order to improve the robustness/defense of…

Machine Learning · Computer Science 2025-10-16 Ziqing Lu , Lifeng Lai , Weiyu Xu

Reinforcement Learning of Markov Decision Processes with Peak Constraints

In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take…

Optimization and Control · Mathematics 2019-12-09 Ather Gattami

Unveiling the Black Box: A Multi-Layer Framework for Explaining Reinforcement Learning-Based Cyber Agents

Reinforcement Learning (RL) agents are increasingly used to simulate sophisticated cyberattacks, but their decision-making processes remain opaque, hindering trust, debugging, and defensive preparedness. In high-stakes cybersecurity…

Cryptography and Security · Computer Science 2026-05-18 Diksha Goel , Kristen Moore , Jeff Wang , Minjune Kim , Thanh Thi Nguyen

Model-based Reinforcement Learning: A Survey

Sequential decision making, commonly formalized as Markov Decision Process (MDP) optimization, is a important challenge in artificial intelligence. Two key approaches to this problem are reinforcement learning (RL) and planning. This paper…

Machine Learning · Computer Science 2022-04-01 Thomas M. Moerland , Joost Broekens , Aske Plaat , Catholijn M. Jonker

End-to-End Policy Gradient Method for POMDPs and Explainable Agents

Real-world decision-making problems are often partially observable, and many can be formulated as a Partially Observable Markov Decision Process (POMDP). When we apply reinforcement learning (RL) algorithms to the POMDP, reasonable…

Artificial Intelligence · Computer Science 2023-04-20 Soichiro Nishimori , Sotetsu Koyamada , Shin Ishii

Reinforcement Learning with a Terminator

We present the problem of reinforcement learning with exogenous termination. We define the Termination Markov Decision Process (TerMDP), an extension of the MDP framework, in which episodes may be interrupted by an external non-Markovian…

Machine Learning · Computer Science 2023-10-09 Guy Tennenholtz , Nadav Merlis , Lior Shani , Shie Mannor , Uri Shalit , Gal Chechik , Assaf Hallak , Gal Dalal

Balancing detectability and performance of attacks on the control channel of Markov Decision Processes

We investigate the problem of designing optimal stealthy poisoning attacks on the control channel of Markov decision processes (MDPs). This research is motivated by the recent interest of the research community for adversarial and poisoning…

Systems and Control · Electrical Eng. & Systems 2021-09-16 Alessio Russo , Alexandre Proutiere

Delay-Aware Model-Based Reinforcement Learning for Continuous Control

Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of delay-aware Markov Decision Process and proves it can be transformed into standard MDP with augmented…

Machine Learning · Computer Science 2021-05-10 Baiming Chen , Mengdi Xu , Liang Li , Ding Zhao

Learning Markov State Abstractions for Deep Reinforcement Learning

A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov. However, when MDPs have rich observations, agents typically learn by way of an abstract state…

Machine Learning · Computer Science 2024-03-18 Cameron Allen , Neev Parikh , Omer Gottesman , George Konidaris

Feature Markov Decision Processes

General purpose intelligent learning agents cycle through (complex,non-MDP) sequences of observations, actions, and rewards. On the other hand, reinforcement learning is well-developed for small finite state Markov Decision Processes…

Artificial Intelligence · Computer Science 2009-12-30 Marcus Hutter

Monitored Markov Decision Processes

In reinforcement learning (RL), an agent learns to perform a task by interacting with an environment and receiving feedback (a numerical reward) for its actions. However, the assumption that rewards are always observable is often not…

Machine Learning · Computer Science 2024-02-15 Simone Parisi , Montaser Mohammedalamen , Alireza Kazemipour , Matthew E. Taylor , Michael Bowling

Optimal Attack and Defense for Reinforcement Learning

To ensure the usefulness of Reinforcement Learning (RL) in real systems, it is crucial to ensure they are robust to noise and adversarial attacks. In adversarial RL, an external attacker has the power to manipulate the victim agent's…

Machine Learning · Computer Science 2024-06-18 Jeremy McMahan , Young Wu , Xiaojin Zhu , Qiaomin Xie