Related papers: A Self-Tuning Actor-Critic Algorithm

200 papers

Reinforcement learning (RL) has had many successes in both "deep" and "shallow" settings. In both cases, significant hyperparameter tuning is often required to achieve good performance. Furthermore, when nonlinear function approximation is…

Machine Learning · Computer Science 2019-05-27 Kenny Young, Baoxiang Wang, Matthew E. Taylor

We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning. MAC is a policy gradient algorithm that uses the agent's explicit representation of all action values to estimate the gradient…

High-precision control tasks present substantial challenges for reinforcement learning (RL) algorithms, frequently resulting in suboptimal performance attributed to network approximation inaccuracies and inadequate sample quality. These…

Machine Learning · Computer Science 2025-02-05 Donghe Chen, Yubin Peng, Tengjie Zheng, Han Wang, Chaoran Qu, Lin Cheng

Stochastic gradient descent (SGD), which updates the model parameters by adding a local gradient times a learning rate at each step, is widely used in model training of machine learning algorithms such as neural networks. It is observed…

Machine Learning · Computer Science 2017-06-01 Chang Xu, Tao Qin, Gang Wang, Tie-Yan Liu

Off-Policy Actor-Critic (Off-PAC) methods have proven successful in a variety of continuous control tasks. Normally, the critic's action-value function is updated using temporal-difference, and the critic in turn provides a loss for the…

Machine Learning · Computer Science 2020-11-03 Wei Zhou, Yiying Li, Yongxin Yang, Huaimin Wang, Timothy M. Hospedales

Actor-critic methods solve reinforcement learning problems by updating a parameterized policy known as an actor in a direction that increases an estimate of the expected return known as a critic. However, existing actor-critic methods only…

Machine Learning · Statistics 2018-02-23 Voot Tangkaratt, Abbas Abdolmaleki, Masashi Sugiyama

Policy gradient methods in actor-critic reinforcement learning (RL) have become perhaps the most promising approaches to solving continuous optimal control problems. However, the trial-and-error nature of RL and the inherent randomness…

Machine Learning · Computer Science 2024-04-19 Ruofan Wu, Junmin Zhong, Jennie Si

Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations. However, these high-dimensional observation spaces present a number of challenges in practice, since the policy must…

Machine Learning · Computer Science 2020-10-27 Alex X. Lee, Anusha Nagabandi, Pieter Abbeel, Sergey Levine

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample…

In this work, we propose Behavior-Guided Actor-Critic (BAC), an off-policy actor-critic deep RL algorithm. BAC mathematically formulates the behavior of the policy through autoencoders by providing an accurate estimation of how frequently…

Machine Learning · Computer Science 2021-04-12 Ammar Fayad, Majd Ibrahim

Deep reinforcement learning (DRL) algorithms have successfully been demonstrated on a range of challenging decision making and control tasks. One dominant component of recent deep reinforcement learning algorithms is the target network…

Machine Learning · Computer Science 2020-11-12 Lin Shao, Yifan You, Mengyuan Yan, Qingyun Sun, Jeannette Bohg

Deep reinforcement learning includes a broad family of algorithms that parameterise an internal representation, such as a value function or policy, by a deep neural network. Each algorithm optimises its parameters with respect to an…

Machine Learning · Computer Science 2020-07-17 Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver

Off-policy actor-critic methods in reinforcement learning train a critic with temporal-difference updates and use it as a learning signal for the policy (actor). This design typically achieves higher sample efficiency than purely on-policy…

Machine Learning · Computer Science 2026-01-05 Uğurcan Özalp

Training intelligent agents through reinforcement learning is a notoriously unstable procedure. Massive parallelization on GPUs and distributed systems has been exploited to generate a large amount of training experiences and consequently…

Machine Learning · Computer Science 2019-02-08 Greg Heinrich, Iuri Frosio

Actor-critic (AC) methods are widely used in reinforcement learning (RL) and benefit from the flexibility of using any policy gradient method as the actor and value-based method as the critic. The critic is usually trained by minimizing the…

Machine Learning · Computer Science 2023-11-01 Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux

This paper introduces a novel reinforcement learning (RL) strategy designed to facilitate rapid autonomy transfer by utilizing pre-trained critic value functions from multiple environments. Unlike traditional methods that require extensive…

Soft Actor-Critic (SAC) is an off-policy actor-critic reinforcement learning algorithm, essentially based on entropy regularization. SAC trains a policy by maximizing the trade-off between expected return and entropy (randomness in the…

Machine Learning · Computer Science 2021-09-27 Chayan Banerjee, Zhiyong Chen, Nasimul Noman

Dynamic Algorithm Configuration (DAC) addresses the challenge of dynamically setting hyperparameters of an algorithm for a diverse set of instances rather than focusing solely on individual tasks. Agents trained with Deep Reinforcement…

Machine Learning · Computer Science 2024-07-19 Carolin Benjamins, Gjorgjina Cenikj, Ana Nikolikj, Aditya Mohan, Tome Eftimov, Marius Lindauer

We present the first class of policy-gradient algorithms that work with both state-value and policy function-approximation, and are guaranteed to converge under off-policy training. Our solution targets problems in reinforcement learning…

Artificial Intelligence · Computer Science 2018-02-23 Hamid Reza Maei

Safe Reinforcement Learning (Safe RL) is one of the prevalently studied subcategories of trial-and-error-based methods with the intention to be deployed on real-world systems. In safe RL, the goal is to maximize reward performance while…

Machine Learning · Computer Science 2024-08-16 Homayoun Honari, Amir Mehdi Soufi Enayati, Mehran Ghafarian Tamizi, Homayoun Najjaran