English
Related papers

Related papers: Proximal Deterministic Policy Gradient

200 papers

Offline Reinforcement Learning (RL) aims to learn a near-optimal policy from a fixed dataset of transitions collected by another policy. This problem has attracted a lot of attention recently, but most existing methods with strong…

Machine Learning · Computer Science 2023-05-23 Germano Gabbianelli , Gergely Neu , Nneka Okolo , Matteo Papini

Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data. Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms take the approach of constraining or regularizing…

Machine Learning · Computer Science 2021-12-06 Scott Fujimoto , Shixiang Shane Gu

Reinforcement learning (RL) is a powerful machine learning technique that enables an intelligent agent to learn an optimal policy that maximizes the cumulative rewards in sequential decision making. Most of methods in the existing…

Machine Learning · Statistics 2023-01-06 Chengchun Shi , Zhengling Qi , Jianing Wang , Fan Zhou

We present an off-policy actor-critic algorithm for Reinforcement Learning (RL) that combines ideas from gradient-free optimization via stochastic search with learned action-value function. The result is a simple procedure consisting of…

This paper prescribes a suite of techniques for off-policy Reinforcement Learning (RL) that simplify the training process and reduce the sample complexity. First, we show that simple Deterministic Policy Gradient works remarkably well as…

Machine Learning · Computer Science 2020-06-30 Rasool Fakoor , Pratik Chaudhari , Alexander J. Smola

Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks. However, these methods are also data-inefficient, afflicted with high variance gradient estimates, and frequently…

Machine Learning · Computer Science 2019-05-15 Andreas Doerr , Michael Volpp , Marc Toussaint , Sebastian Trimpe , Christian Daniel

Offline reinforcement learning (RL) looks at learning how to optimally solve tasks using a fixed dataset of interactions from the environment. Many off-policy algorithms developed for online learning struggle in the offline setting as they…

Machine Learning · Computer Science 2025-03-18 Natinael Solomon Neggatu , Jeremie Houssineau , Giovanni Montana

Offline reinforcement learning (RL) is a challenging setting where existing off-policy actor-critic methods perform poorly due to the overestimation of out-of-distribution state-action pairs. Thus, various additional augmentations are…

Machine Learning · Computer Science 2023-02-23 Zifeng Zhuang , Kun Lei , Jinxin Liu , Donglin Wang , Yilang Guo

Many reinforcement learning algorithms, particularly those that rely on return estimates for policy improvement, can suffer from poor sample efficiency and training instability due to high-variance return estimates. In this paper we…

Machine Learning · Computer Science 2026-01-06 Alexander W. Goodall , Edwin Hamel-De le Court , Francesco Belardinelli

We study offline reinforcement learning (RL) which seeks to learn a good policy based on a fixed, pre-collected dataset. A fundamental challenge behind this task is the distributional shift due to the dataset lacking sufficient exploration,…

Machine Learning · Computer Science 2023-10-11 Wenzhuo Zhou

We study the problem of off-policy value evaluation in reinforcement learning (RL), where one aims to estimate the value of a new policy based on data collected by a different policy. This problem is often a critical step when applying RL…

Machine Learning · Computer Science 2016-05-27 Nan Jiang , Lihong Li

We study the problem of off-policy evaluation (OPE) in Reinforcement Learning (RL), where the aim is to estimate the performance of a new policy given historical data that may have been generated by a different policy, or policies. In…

Machine Learning · Computer Science 2019-12-16 Aurélien F. Bibaut , Ivana Malenica , Nikos Vlassis , Mark J. van der Laan

Learning optimal behavior from existing data is one of the most important problems in Reinforcement Learning (RL). This is known as "off-policy control" in RL where an agent's objective is to compute an optimal policy based on the data…

Machine Learning · Computer Science 2022-06-16 Raghuram Bharadwaj Diddigi , Prateek Jain , Prabuchandran K. J. , Shalabh Bhatnagar

Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning. By exploiting historical transitions, a policy is trained to maximize a learned value function while…

Machine Learning · Computer Science 2023-07-25 Jiachen Li , Edwin Zhang , Ming Yin , Qinxun Bai , Yu-Xiang Wang , William Yang Wang

Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-critic approach involving off-policy evaluation. In this paper we show that simply doing one step of constrained/regularized policy improvement using…

Machine Learning · Computer Science 2021-12-06 David Brandfonbrener , William F. Whitney , Rajesh Ranganath , Joan Bruna

Primal-dual safe RL methods commonly perform iterations between the primal update of the policy and the dual update of the Lagrange Multiplier. Such a training paradigm is highly susceptible to the error in cumulative cost estimation since…

Machine Learning · Computer Science 2024-04-16 Zifan Wu , Bo Tang , Qian Lin , Chao Yu , Shangqin Mao , Qianlong Xie , Xingxing Wang , Dong Wang

In this work, we present a new model-free and off-policy reinforcement learning (RL) algorithm, that is capable of finding a near-optimal policy with state-action observations from arbitrary behavior policies. Our algorithm, called the…

Optimization and Control · Mathematics 2025-07-21 Narim Jeong , Donghwan Lee , Niao He

In recent years, reinforcement learning (RL) has gained increasing attention in control engineering. Especially, policy gradient methods are widely used. In this work, we improve the tracking performance of proximal policy optimization…

Machine Learning · Computer Science 2021-07-21 Jana Mayer , Johannes Westermann , Juan Pedro Gutiérrez H. Muriedas , Uwe Mettin , Alexander Lampe

Deep reinforcement learning algorithms require large amounts of experience to learn an individual task. While in principle meta-reinforcement learning (meta-RL) algorithms enable agents to learn new skills from small amounts of experience,…

Machine Learning · Computer Science 2019-03-21 Kate Rakelly , Aurick Zhou , Deirdre Quillen , Chelsea Finn , Sergey Levine

Offline reinforcement learning (RL) allows for the training of competent agents from offline datasets without any interaction with the environment. Online finetuning of such offline models can further improve performance. But how should we…

Machine Learning · Computer Science 2023-03-31 Yicheng Luo , Jackie Kay , Edward Grefenstette , Marc Peter Deisenroth
‹ Prev 1 2 3 10 Next ›