Related papers: A Hybrid PAC Reinforcement Learning Algorithm

Revisiting State Augmentation methods for Reinforcement Learning with Stochastic Delays

Several real-world scenarios, such as remote control and sensing, involve action and observation delays. The presence of delays degrades the performance of reinforcement learning (RL) algorithms, often to such an extent that…

Machine Learning · Computer Science 2021-08-18 Somjit Nath, Mayank Baranwal, Harshad Khadilkar

Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization

Reinforcement learning (RL) is a classical tool to solve network control or policy optimization problems in unknown environments. The original Q-learning suffers from performance and complexity challenges across very large networks. Herein,…

Machine Learning · Computer Science 2024-09-02 Talha Bozkus, Urbashi Mitra

A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees

We study a primal-dual (PD) reinforcement learning (RL) algorithm for online constrained Markov decision processes (CMDPs). Despite its widespread practical use, the existing theoretical literature on PD-RL algorithms for this problem only…

Machine Learning · Computer Science 2024-07-02 Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo

A generalized stacked reinforcement learning method for sampled systems

A common setting of reinforcement learning (RL) is a Markov decision process (MDP) in which the environment is a stochastic discrete-time dynamical system. Whereas MDPs are suitable in such applications as video-games or puzzles, physical…

Robotics · Computer Science 2022-11-29 Pavel Osinenko, Dmitrii Dobriborsci, Grigory Yaremenko, Georgiy Malaniya

Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs

Hybrid Reinforcement Learning (RL), where an agent learns from both an offline dataset and online explorations in an unknown environment, has garnered significant recent interest. A crucial question posed by Xie et al. (2022) is whether…

Machine Learning · Statistics 2024-08-09 Kevin Tan, Wei Fan, Yuting Wei

Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes

To overcome the curses of dimensionality and modeling of Dynamic Programming (DP) methods to solve Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice. Contrary to traditional RL algorithms…

Machine Learning · Computer Science 2021-08-24 Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

Reinforcement Learning in a Physics-Inspired Semi-Markov Environment

Reinforcement learning (RL) has been demonstrated to have great potential in many applications of scientific discovery and design. Recent work includes, for example, the design of new structures and compositions of molecules for therapeutic…

Machine Learning · Computer Science 2020-04-17 Colin Bellinger, Rory Coles, Mark Crowley, Isaac Tamblyn

Reinforcement Learning: a Comparison of UCB Versus Alternative Adaptive Policies

In this paper we consider the basic version of Reinforcement Learning (RL) that involves computing optimal data driven (adaptive) policies for Markovian decision process with unknown transition probabilities. We provide a brief survey of…

Machine Learning · Computer Science 2019-09-16 Wesley Cowan, Michael N. Katehakis, Daniel Pirutinsky

On Practical Robust Reinforcement Learning: Practical Uncertainty Set and Double-Agent Algorithm

Robust reinforcement learning (RRL) aims at seeking a robust policy to optimize the worst case performance over an uncertainty set of Markov decision processes (MDPs). This set contains some perturbed MDPs from a nominal MDP (N-MDP) that…

Machine Learning · Computer Science 2023-11-21 Ukjo Hwang, Songnam Hong

Reconciling Discrete-Time Mixed Policies and Continuous-Time Relaxed Controls in Reinforcement Learning and Stochastic Control

Reinforcement learning (RL) is currently one of the most prominent methods for optimizing dynamical systems, with breakthrough results across various fields. The framework is based on the concept of a Markov decision process (MDP), leading…

Optimization and Control · Mathematics 2025-11-17 Rene Carmona, Mathieu Lauriere

Reinforcement Learning for Learning of Dynamical Systems in Uncertain Environment: a Tutorial

In this paper, a review of model-free reinforcement learning for learning of dynamical systems in uncertain environments is discussed. For this purpose, the Markov Decision Process (MDP) will be reviewed. Furthermore, some learning…

Machine Learning · Computer Science 2019-05-21 Mehran Attar, Mohammadreza Dabirian

A reinforcement learning approach to hybrid control design

In this paper we design hybrid control policies for hybrid systems whose mathematical models are unknown. Our contributions are threefold. First, we propose a framework for modelling the hybrid control design problem as a single Markov…

Systems and Control · Electrical Eng. & Systems 2020-09-03 Meet Gandhi, Atreyee Kundu, Shalabh Bhatnagar

Delay-Aware Model-Based Reinforcement Learning for Continuous Control

Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of delay-aware Markov Decision Process and proves it can be transformed into standard MDP with augmented…

Machine Learning · Computer Science 2021-05-10 Baiming Chen, Mengdi Xu, Liang Li, Ding Zhao

Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis

In Markov decision processes (MDPs), quantile risk measures such as Value-at-Risk are a standard metric for modeling RL agents' preferences for certain outcomes. This paper proposes a new Q-learning algorithm for quantile optimization in…

Machine Learning · Computer Science 2024-11-01 Jia Lin Hau, Erick Delage, Esther Derman, Mohammad Ghavamzadeh, Marek Petrik

GAN Q-learning

Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation. However, there are many different ways in which one can…

Machine Learning · Statistics 2018-07-24 Thang Doan, Bogdan Mazoure, Clare Lyle

Model-based Reinforcement Learning: A Survey

Sequential decision making, commonly formalized as Markov Decision Process (MDP) optimization, is an important challenge in artificial intelligence. Two key approaches to this problem are reinforcement learning (RL) and planning. This paper…

Machine Learning · Computer Science 2022-04-01 Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker

Model-Based Reinforcement Learning in Discrete-Action Non-Markovian Reward Decision Processes

Many practical decision-making problems involve tasks whose success depends on the entire system history, rather than on achieving a state with desired properties. Markovian Reinforcement Learning (RL) approaches are not suitable for such…

Machine Learning · Computer Science 2025-12-17 Alessandro Trapasso, Luca Iocchi, Fabio Patrizi

$QD$-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

The paper considers a class of multi-agent Markov decision processes (MDPs), in which the network agents respond differently (as manifested by the instantaneous one-stage random costs) to a global controlled state and the control actions of…

Machine Learning · Statistics 2015-06-04 Soummya Kar, José M. F. Moura, H. Vincent Poor

PAC Reinforcement Learning Algorithm for General-Sum Markov Games

This paper presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. The paper offers an extension to the well-known Nash Q-learning algorithm, using the…

Computer Science and Game Theory · Computer Science 2020-09-09 Ashkan Zehfroosh, Herbert G. Tanner

Safe Reinforcement Learning via Probabilistic Shields

This paper targets the efficient construction of a safety shield for decision making in scenarios that incorporate uncertainty. Markov decision processes (MDPs) are prominent models to capture such planning problems. Reinforcement learning…

Artificial Intelligence · Computer Science 2019-11-26 Nils Jansen, Bettina Könighofer, Sebastian Junges, Alexandru C. Serban, Roderick Bloem