Related papers: Guide Actor-Critic for Continuous Control

200 papers

We identify a fundamental problem in policy gradient-based methods in continuous control. As policy gradient methods require the agent's underlying probability distribution, they limit policy representation to parametric distribution…

Machine Learning · Computer Science 2019-11-26 Chen Tessler, Guy Tennenholtz, Shie Mannor

Actor-critic (AC) methods are widely used in reinforcement learning (RL) and benefit from the flexibility of using any policy gradient method as the actor and value-based method as the critic. The critic is usually trained by minimizing the…

Machine Learning · Computer Science 2023-11-01 Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux

Despite definite success in deep reinforcement learning problems, actor-critic algorithms are still confronted with sample inefficiency in complex environments, particularly in tasks where efficient exploration is a bottleneck. These…

Machine Learning · Computer Science 2021-02-09 Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist

Policy gradient methods in actor-critic reinforcement learning (RL) have become perhaps the most promising approaches to solving continuous optimal control problems. However, the trial-and-error nature of RL and the inherent randomness…

Machine Learning · Computer Science 2024-04-19 Ruofan Wu, Junmin Zhong, Jennie Si

Actor-critic (AC) methods are ubiquitous in reinforcement learning. Although it is understood that AC methods are closely related to policy gradient (PG), their precise connection has not been fully characterized previously. In this paper,…

Artificial Intelligence · Computer Science 2021-06-15 Junfeng Wen, Saurabh Kumar, Ramki Gummadi, Dale Schuurmans

Conventional Reinforcement Learning (RL) algorithms, typically focused on estimating or maximizing expected returns, face challenges when refining offline pretrained models with online experiences. This paper introduces Generative Actor…

Machine Learning · Computer Science 2025-12-29 Aoyang Qin, Deqian Kong, Wei Wang, Ying Nian Wu, Song-Chun Zhu, Sirui Xie

We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning. MAC is a policy gradient algorithm that uses the agent's explicit representation of all action values to estimate the gradient…

Model-free deep reinforcement learning has achieved great success in many domains, such as video games, recommendation systems and robotic control tasks. In continuous control tasks, widely used policies with Gaussian distributions result…

Machine Learning · Computer Science 2023-06-05 Lingwei Peng, Hui Qian, Zhebang Shen, Chao Zhang, Fei Li

High-precision control tasks present substantial challenges for reinforcement learning (RL) algorithms, frequently resulting in suboptimal performance attributed to network approximation inaccuracies and inadequate sample quality. These…

Machine Learning · Computer Science 2025-02-05 Donghe Chen, Yubin Peng, Tengjie Zheng, Han Wang, Chaoran Qu, Lin Cheng

Reinforcement learning, mathematically described by Markov Decision Problems, may be approached either through dynamic programming or policy search. Actor-critic algorithms combine the merits of both approaches by alternating between steps…

Machine Learning · Computer Science 2023-01-31 Harshat Kumar, Alec Koppel, Alejandro Ribeiro

In this paper, we consider the problem of actor-critic reinforcement learning. Firstly, we extend the actor-critic architecture to an actor-critic-N architecture by introducing more critics beyond rewards. Secondly, we combine the reward-based…

Machine Learning · Computer Science 2020-06-15 Weiya Ren

In this paper, we propose a second-order deterministic actor-critic framework in reinforcement learning that extends the classical deterministic policy gradient method to exploit curvature information of the performance function. Building…

Machine Learning · Computer Science 2025-11-13 Arash Bahari Kordabad, Dean Brandner, Sebastien Gros, Sergio Lucia, Sadegh Soudjani

This paper presents a new method, adversarial advantage actor-critic (Adversarial A2C), which significantly improves the efficiency of dialogue policy learning in task-completion dialogue systems. Inspired by generative adversarial…

Computation and Language · Computer Science 2018-02-09 Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Yun-Nung Chen, Kam-Fai Wong

Deterministic-policy actor-critic algorithms for continuous control improve the actor by plugging its actions into the critic and ascending the action-value gradient, which is obtained by chaining the actor's Jacobian matrix with the…

Artificial Intelligence · Computer Science 2020-10-23 Pierluca D'Oro, Wojciech Jaśkowski

In this work, we propose Behavior-Guided Actor-Critic (BAC), an off-policy actor-critic deep RL algorithm. BAC mathematically formulates the behavior of the policy through autoencoders by providing an accurate estimation of how frequently…

Machine Learning · Computer Science 2021-04-12 Ammar Fayad, Majd Ibrahim

Off-Policy Actor-Critic (Off-PAC) methods have proven successful in a variety of continuous control tasks. Normally, the critic's action-value function is updated using temporal-difference, and the critic in turn provides a loss for the…

Machine Learning · Computer Science 2020-11-03 Wei Zhou, Yiying Li, Yongxin Yang, Huaimin Wang, Timothy M. Hospedales

We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform…

Machine Learning · Computer Science 2022-07-20 Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor

In this work, we consider policy-based methods for solving the reinforcement learning problem, and establish the sample complexity guarantees. A policy-based algorithm typically consists of an actor and a critic. We consider using various…

Machine Learning · Computer Science 2023-01-16 Zaiwei Chen, Siva Theja Maguluri

Actor-critic methods can achieve incredible performance on difficult reinforcement learning problems, but they are also prone to instability. This is partly due to the interaction between the actor and critic during learning, e.g., an…

Machine Learning · Computer Science 2019-02-26 Simone Parisi, Voot Tangkaratt, Jan Peters, Mohammad Emtiyaz Khan

This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic or Dual-AC. It is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation, which can be viewed as a two-player game between…

Machine Learning · Computer Science 2018-01-01 Bo Dai, Albert Shaw, Niao He, Lihong Li, Le Song