Related papers: Improper Reinforcement Learning with Gradient-base…

200 papers

We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform…

Machine Learning · Computer Science 2022-07-20 Mohammadi Zaki , Avinash Mohan , Aditya Gopalan , Shie Mannor

In this paper we propose a novel gradient algorithm to learn a policy from an expert's observed behavior assuming that the expert behaves optimally with respect to some unknown reward function of a Markovian Decision Problem. The…

Machine Learning · Computer Science 2012-06-26 Gergely Neu , Csaba Szepesvari

Inverse Reinforcement Learning addresses the problem of inferring an expert's reward function from demonstrations. However, in many applications, we not only have access to the expert's near-optimal behavior, but we also observe part of her…

Machine Learning · Computer Science 2021-09-03 Giorgia Ramponi , Gianluca Drappo , Marcello Restelli

Using the policy gradient algorithm, we train a single-hidden-layer neural network to balance a physically accurate simulation of a single inverted pendulum. The trained weights and biases can then be transferred to a physical agent, where…

Machine Learning · Computer Science 2021-02-17 Dylan Bates

Gradient-based methods have been widely used for system design and optimization in diverse application domains. Recently, there has been a renewed interest in studying theoretical properties of these methods in the context of control and…

Optimization and Control · Mathematics 2022-10-11 Bin Hu , Kaiqing Zhang , Na Li , Mehran Mesbahi , Maryam Fazel , Tamer Başar

This paper develops the first policy gradient method with global optimality guarantee and complexity analysis for robust reinforcement learning under model mismatch. Robust reinforcement learning is to learn a policy robust to model…

Machine Learning · Computer Science 2022-05-17 Yue Wang , Shaofeng Zou

Reinforcement learning considers the problem of finding policies that maximize an expected cumulative reward in a Markov decision process with unknown transition probabilities. In this paper we consider the problem of finding optimal…

Machine Learning · Computer Science 2020-10-19 Santiago Paternain , Juan Andres Bazerque , Alejandro Ribeiro

In order to model risk aversion in reinforcement learning, an emerging line of research adapts familiar algorithms to optimize coherent risk functionals, a class that includes conditional value-at-risk (CVaR). Because optimizing the…

Machine Learning · Computer Science 2021-03-09 Audrey Huang , Liu Leqi , Zachary C. Lipton , Kamyar Azizzadenesheli

We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay. Modern communication systems are becoming increasingly complex, and are required to handle multiple types of traffic with widely…

Machine Learning · Computer Science 2021-05-04 Mohammadi Zaki , Avi Mohan , Aditya Gopalan , Shie Mannor

Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks. However, these methods are also data-inefficient, afflicted with high variance gradient estimates, and frequently…

Machine Learning · Computer Science 2019-05-15 Andreas Doerr , Michael Volpp , Marc Toussaint , Sebastian Trimpe , Christian Daniel

We propose policy gradient algorithms which learn risk-sensitive policies in a reinforcement learning (RL) framework. Our proposed algorithms maximize the distortion risk measure (DRM) of the cumulative reward in an episodic Markov decision…

Machine Learning · Computer Science 2024-02-06 Nithia Vijayan , Prashanth L. A

Model-based policy optimization is a well-established framework for designing reliable and high-performance controllers across a wide range of control applications. Recently, this approach has been extended to model predictive control…

Systems and Control · Electrical Eng. & Systems 2026-04-15 Riccardo Zuliani , Efe C. Balta , John Lygeros

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world…

Machine Learning · Statistics 2017-11-15 Felix Berkenkamp , Matteo Turchetta , Angela P. Schoellig , Andreas Krause

Policy gradient methods are among the most effective methods in challenging reinforcement learning problems with large state and/or action spaces. However, little is known about even their most basic theoretical convergence properties,…

Machine Learning · Computer Science 2020-10-16 Alekh Agarwal , Sham M. Kakade , Jason D. Lee , Gaurav Mahajan

In numerous reinforcement learning (RL) problems involving safety-critical systems, a key challenge lies in balancing multiple objectives while simultaneously meeting all stringent safety constraints. To tackle this issue, we propose a…

Artificial Intelligence · Computer Science 2024-05-28 Shangding Gu , Bilgehan Sel , Yuhao Ding , Lu Wang , Qingwei Lin , Alois Knoll , Ming Jin

We study reinforcement learning in hybrid discrete-continuous action spaces, such as settings where the discrete component selects a regime (or index) and the continuous component optimizes within it -- a structure common in robotics,…

Machine Learning · Computer Science 2026-05-15 Matias Alvo , Daniel Russo , Yash Kanoria

We propose policy gradient algorithms for solving a risk-sensitive reinforcement learning (RL) problem in on-policy as well as off-policy settings. We consider episodic Markov decision processes, and model the risk using the broad class of…

Machine Learning · Computer Science 2024-06-25 Nithia Vijayan , Prashanth L. A

Policy gradient methods have become a standard for training reinforcement learning agents in a scalable and efficient manner. However, they do not account for transition uncertainty, whereas learning robust policies can be computationally…

Machine Learning · Computer Science 2023-12-12 Navdeep Kumar , Esther Derman , Matthieu Geist , Kfir Levy , Shie Mannor

Inverse reinforcement learning (IRL) addresses the problem of recovering a task description given a demonstration of the optimal policy used to solve such a task. The optimal policy is usually provided by an expert or teacher, making IRL…

Machine Learning · Computer Science 2012-02-09 Héctor Ratia , Luis Montesano , Ruben Martinez-Cantin

Real-world control systems require policies that are not only high-performing but also interpretable and robust. A promising direction toward this goal is model-based control, which learns system dynamics and cost functions from historical…

Systems and Control · Electrical Eng. & Systems 2025-11-20 Yuexin Bian , Jie Feng , Yuanyuan Shi