English
Related papers

Related papers: Diversity Actor-Critic: Sample-Aware Entropy Regul…

200 papers

We aim to develop off-policy DRL algorithms that not only exceed state-of-the-art performance but are also simple and minimalistic. For standard continuous control benchmarks, Soft Actor-Critic (SAC), which employs entropy maximization,…

Machine Learning · Computer Science 2020-12-08 Che Wang , Yanqiu Wu , Quan Vuong , Keith Ross

Soft Actor-Critic (SAC) is an off-policy actor-critic reinforcement learning algorithm, essentially based on entropy regularization. SAC trains a policy by maximizing the trade-off between expected return and entropy (randomness in the…

Machine Learning · Computer Science 2021-09-27 Chayan Banerjee , Zhiyong Chen , Nasimul Noman

Deep Reinforcement Learning (DRL) algorithms for continuous action spaces are known to be brittle toward hyperparameters as well as \cut{being}sample inefficient. Soft Actor Critic (SAC) proposes an off-policy deep actor critic algorithm…

Machine Learning · Computer Science 2019-06-10 Patrick Nadeem Ward , Ariella Smofsky , Avishek Joey Bose

We propose a new policy iteration theory as an important extension of soft policy iteration and Soft Actor-Critic (SAC), one of the most efficient model free algorithms for deep reinforcement learning. Supported by the new theory, arbitrary…

Machine Learning · Computer Science 2019-02-18 Gang Chen , Yiming Peng

We present a novel extension to the family of Soft Actor-Critic (SAC) algorithms. We argue that based on the Maximum Entropy Principle, discrete SAC can be further improved via additional statistical constraints derived from a surrogate…

Machine Learning · Computer Science 2025-06-24 Dexter Neo , Tsuhan Chen

Model-free deep reinforcement learning has achieved great success in many domains, such as video games, recommendation systems and robotic control tasks. In continuous control tasks, widely used policies with Gaussian distributions results…

Machine Learning · Computer Science 2023-06-05 Lingwei Peng , Hui Qian , Zhebang Shen , Chao Zhang , Fei Li

Reinforcement Learning (RL) has shown great potential in complex control tasks, particularly when combined with deep neural networks within the Actor-Critic (AC) framework. However, in practical applications, balancing exploration, learning…

Robotics · Computer Science 2026-02-25 Zhiwei Shang , Xinyi Yuan , Wenjun Huang , Yunduan Cui , Di Chen , Meixin Zhu

This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several…

Machine Learning · Computer Science 2017-07-11 Ziyu Wang , Victor Bapst , Nicolas Heess , Volodymyr Mnih , Remi Munos , Koray Kavukcuoglu , Nando de Freitas

We propose a fully distributed actor-critic algorithm approximated by deep neural networks, named \textit{Diff-DAC}, with application to single-task and to average multitask reinforcement learning (MRL). Each agent has access to data from…

Entropy regularization is a popular method in reinforcement learning (RL). Although it has many advantages, it alters the RL objective of the original Markov Decision Process (MDP). Though divergence regularization has been proposed to…

Machine Learning · Computer Science 2022-06-22 Kefan Su , Zongqing Lu

Risk-aware Reinforcement Learning (RL) algorithms like SAC and TD3 were shown empirically to outperform their risk-neutral counterparts in a variety of continuous-action tasks. However, the theoretical basis for the pessimistic objectives…

Machine Learning · Computer Science 2024-05-27 Michal Nauman , Marek Cygan

Reinforcement learning (RL) has proven highly effective in addressing complex decision-making and control tasks. However, in most traditional RL algorithms, the policy is typically parameterized as a diagonal Gaussian distribution with…

Machine Learning · Computer Science 2024-12-24 Yinuo Wang , Likun Wang , Yuxuan Jiang , Wenjun Zou , Tong Liu , Xujie Song , Wenxuan Wang , Liming Xiao , Jiang Wu , Jingliang Duan , Shengbo Eben Li

Designing off-policy reinforcement learning algorithms is typically a very challenging task, because a desirable iteration update often involves an expectation over an on-policy distribution. Prior off-policy actor-critic (AC) algorithms…

Machine Learning · Computer Science 2021-07-20 Tengyu Xu , Zhuoran Yang , Zhaoran Wang , Yingbin Liang

Soft Actor-Critic (SAC) is one of the state-of-the-art off-policy reinforcement learning (RL) algorithms that is within the maximum entropy based RL framework. SAC is demonstrated to perform very well in a list of continous control tasks…

Machine Learning · Computer Science 2021-12-22 Zhenyang Shi , Surya P. N. Singh

Soft actor-critic (SAC) in reinforcement learning is expected to be one of the next-generation robot control schemes. Its ability to maximize policy entropy would make a robotic controller robust to noise and perturbation, which is useful…

Machine Learning · Computer Science 2023-07-04 Taisuke Kobayashi

Many policy gradient methods are variants of Actor-Critic (AC), where a value function (critic) is learned to facilitate updating the parameterized policy (actor). The update to the actor involves a log-likelihood update weighted by the…

Machine Learning · Computer Science 2023-03-02 Samuel Neumann , Sungsu Lim , Ajin Joseph , Yangchen Pan , Adam White , Martha White

Soft Actor-Critic (SAC) is an off-policy actor-critic deep reinforcement learning (DRL) algorithm based on maximum entropy reinforcement learning. By combining off-policy updates with an actor-critic formulation, SAC achieves…

Machine Learning · Computer Science 2019-06-11 Che Wang , Keith Ross

The actor-critic (AC) algorithm is a popular method to find an optimal policy in reinforcement learning. In the infinite horizon scenario, the finite-sample convergence rate for the AC and natural actor-critic (NAC) algorithms has been…

Machine Learning · Computer Science 2021-02-15 Tengyu Xu , Zhe Wang , Yingbin Liang

Soft actor-critic (SAC) is a popular algorithm for max-entropy reinforcement learning. In practice, the energy-based policies in SAC are often approximated using simple policy classes for efficiency, sacrificing the expressiveness and…

Machine Learning · Computer Science 2026-01-01 Yuyang Zhang , Yang Hu , Bo Dai , Na Li

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample…

‹ Prev 1 2 3 10 Next ›