Related papers: Gradient dynamics in reinforcement learning

Identifying Policy Gradient Subspaces

Policy gradient methods hold great potential for solving complex continuous control tasks. Still, their training efficiency can be improved by exploiting structure within the optimization problem. Recent work indicates that supervised…

Machine Learning · Computer Science 2024-03-19 Jan Schneider , Pierre Schumacher , Simon Guist , Le Chen , Daniel Häufle , Bernhard Schölkopf , Dieter Büchler

Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems

Reinforcement learning is a promising approach to learning robotics controllers. It has recently been shown that algorithms based on finite-difference estimates of the policy gradient are competitive with algorithms based on the policy…

Machine Learning · Computer Science 2021-10-12 Osbert Bastani

Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods

In this paper we propose a novel gradient algorithm to learn a policy from an expert's observed behavior assuming that the expert behaves optimally with respect to some unknown reward function of a Markovian Decision Problem. The…

Machine Learning · Computer Science 2012-06-26 Gergely Neu , Csaba Szepesvari

Distributed Learning in Non-Convex Environments -- Part I: Agreement at a Linear Rate

Driven by the need to solve increasingly complex optimization problems in signal processing and machine learning, there has been increasing interest in understanding the behavior of gradient-descent algorithms in non-convex environments.…

Optimization and Control · Mathematics 2019-07-04 Stefan Vlaski , Ali H. Sayed

Reinforcement Learning for Adaptive Routing

Reinforcement learning means learning a policy--a mapping of observations into actions--based on feedback from the environment. The learning can be viewed as browsing a set of policies while evaluating them by trial through interaction with…

Machine Learning · Computer Science 2017-05-25 Leonid Peshkin , Virginia Savova

The Definitive Guide to Policy Gradients in Deep Reinforcement Learning: Theory, Algorithms and Implementations

In recent years, various powerful policy gradient algorithms have been proposed in deep reinforcement learning. While all these algorithms build on the Policy Gradient Theorem, the specific design choices differ significantly across…

Machine Learning · Computer Science 2024-03-04 Matthias Lehmann

Constrained Policy Gradient Method for Safe and Fast Reinforcement Learning: a Neural Tangent Kernel Based Approach

This paper presents a constrained policy gradient algorithm. We introduce constraints for safe learning with the following steps. First, learning is slowed down (lazy learning) so that the episodic policy change can be computed with the…

Machine Learning · Computer Science 2022-01-24 Balázs Varga , Balázs Kulcsár , Morteza Haghir Chehreghani

Linear convergence of a policy gradient method for some finite horizon continuous time control problems

Despite its popularity in the reinforcement learning community, a provably convergent policy gradient method for continuous space-time control problems with nonlinear state dynamics has been elusive. This paper proposes proximal gradient…

Optimization and Control · Mathematics 2022-12-27 Christoph Reisinger , Wolfgang Stockinger , Yufei Zhang

Trajectory-Based Off-Policy Deep Reinforcement Learning

Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks. However, these methods are also data-inefficient, afflicted with high variance gradient estimates, and frequently…

Machine Learning · Computer Science 2019-05-15 Andreas Doerr , Michael Volpp , Marc Toussaint , Sebastian Trimpe , Christian Daniel

Hindsight policy gradients

A reinforcement learning agent that needs to pursue different goals across episodes requires a goal-conditional policy. In addition to their potential to generalize desirable behavior to unseen goals, such policies may also enable…

Machine Learning · Computer Science 2019-02-21 Paulo Rauber , Avinash Ummadisingu , Filipe Mutz , Juergen Schmidhuber

Policy Gradients for Probabilistic Constrained Reinforcement Learning

This paper considers the problem of learning safe policies in the context of reinforcement learning (RL). In particular, we consider the notion of probabilistic safety. This is, we aim to design policies that maintain the state of the…

Machine Learning · Computer Science 2023-04-20 Weiqin Chen , Dharmashankar Subramanian , Santiago Paternain

Stochastic Reinforcement Learning with Stability Guarantees for Control of Unknown Nonlinear Systems

Designing a stabilizing controller for nonlinear systems is a challenging task, especially for high-dimensional problems with unknown dynamics. Traditional reinforcement learning algorithms applied to stabilization tasks tend to drive the…

Systems and Control · Electrical Eng. & Systems 2024-09-16 Thanin Quartz , Ruikun Zhou , Hans De Sterck , Jun Liu

A policy gradient approach for optimization of smooth risk measures

We propose policy gradient algorithms for solving a risk-sensitive reinforcement learning (RL) problem in on-policy as well as off-policy settings. We consider episodic Markov decision processes, and model the risk using the broad class of…

Machine Learning · Computer Science 2024-06-25 Nithia Vijayan , Prashanth L. A

Performative Reinforcement Learning

We introduce the framework of performative reinforcement learning where the policy chosen by the learner affects the underlying reward and transition dynamics of the environment. Following the recent literature on performative…

Machine Learning · Computer Science 2023-06-08 Debmalya Mandal , Stelios Triantafyllou , Goran Radanovic

Leveraging Reward Gradients For Reinforcement Learning in Differentiable Physics Simulations

In recent years, fully differentiable rigid body physics simulators have been developed, which can be used to simulate a wide range of robotic systems. In the context of reinforcement learning for control, these simulators theoretically…

Machine Learning · Computer Science 2022-03-08 Sean Gillen , Katie Byl

On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift

Policy gradient methods are among the most effective methods in challenging reinforcement learning problems with large state and/or action spaces. However, little is known about even their most basic theoretical convergence properties,…

Machine Learning · Computer Science 2020-10-16 Alekh Agarwal , Sham M. Kakade , Jason D. Lee , Gaurav Mahajan

Distributed Policy Gradient with Variance Reduction in Multi-Agent Reinforcement Learning

This paper studies a distributed policy gradient in collaborative multi-agent reinforcement learning (MARL), where agents over a communication network aim to find the optimal policy to maximize the average of all agents' local returns. Due…

Multiagent Systems · Computer Science 2022-12-06 Xiaoxiao Zhao , Jinlong Lei , Li Li , Jie Chen

A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning

A fundamental challenge in multiagent reinforcement learning is to learn beneficial behaviors in a shared environment with other simultaneously learning agents. In particular, each agent perceives the environment as effectively…

Machine Learning · Computer Science 2021-06-15 Dong-Ki Kim , Miao Liu , Matthew Riemer , Chuangchuang Sun , Marwa Abdulhai , Golnaz Habibi , Sebastian Lopez-Cot , Gerald Tesauro , Jonathan P. How

Policy Gradient for Continuing Tasks in Non-stationary Markov Decision Processes

Reinforcement learning considers the problem of finding policies that maximize an expected cumulative reward in a Markov decision process with unknown transition probabilities. In this paper we consider the problem of finding optimal…

Machine Learning · Computer Science 2020-10-19 Santiago Paternain , Juan Andres Bazerque , Alejandro Ribeiro

Reward-Conditioned Policies

Reinforcement learning offers the promise of automating the acquisition of complex behavioral skills. However, compared to commonly used and well-understood supervised learning methods, reinforcement learning algorithms can be brittle,…

Machine Learning · Computer Science 2020-01-01 Aviral Kumar , Xue Bin Peng , Sergey Levine