Related papers: A Strategy-Oriented Bayesian Soft Actor-Critic Mod…

Bayesian Soft Actor-Critic: A Directed Acyclic Strategy Graph Based Deep Reinforcement Learning

Adopting reasonable strategies is challenging but crucial for an intelligent agent with limited resources working in hazardous, unstructured, and dynamic environments to improve the system's utility, decrease the overall cost, and increase…

Artificial Intelligence · Computer Science 2023-12-06 Qin Yang , Ramviyas Parasuraman

DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning

We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines the strengths of distributional information of accumulated rewards and entropy-driven exploration from Soft…

Machine Learning · Computer Science 2025-07-01 Xiaoteng Ma , Junyao Chen , Li Xia , Jun Yang , Qianchuan Zhao , Zhengyuan Zhou

Soft Actor-Critic Algorithms and Applications

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample…

Machine Learning · Computer Science 2019-09-16 Tuomas Haarnoja , Aurick Zhou , Kristian Hartikainen , George Tucker , Sehoon Ha , Jie Tan , Vikash Kumar , Henry Zhu , Abhishek Gupta , Pieter Abbeel , Sergey Levine

Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors

In reinforcement learning (RL), function approximation errors are known to easily lead to the Q-value overestimations, thus greatly reducing policy performance. This paper presents a distributional soft actor-critic (DSAC) algorithm, which…

Machine Learning · Computer Science 2021-06-14 Jingliang Duan , Yang Guan , Shengbo Eben Li , Yangang Ren , Bo Cheng

Soft Actor-Critic with Cross-Entropy Policy Optimization

Soft Actor-Critic (SAC) is one of the state-of-the-art off-policy reinforcement learning (RL) algorithms that is within the maximum entropy based RL framework. SAC is demonstrated to perform very well in a list of continous control tasks…

Machine Learning · Computer Science 2021-12-22 Zhenyang Shi , Surya P. N. Singh

PAC-Bayesian Soft Actor-Critic Learning

Actor-critic algorithms address the dual goals of reinforcement learning (RL), policy evaluation and improvement via two separate function approximators. The practicality of this approach comes at the expense of training instability, caused…

Machine Learning · Computer Science 2024-06-11 Bahareh Tasdighi , Abdullah Akgül , Manuel Haussmann , Kenny Kazimirzak Brink , Melih Kandemir

Dissecting Discrete Soft Actor-Critic: Limitations and Principled Alternatives

While Soft Actor-Critic (SAC) is highly effective in continuous control, its discrete counterpart (DSAC) performs poorly on challenging discrete-action domains such as Atari. Consequently, starting from DSAC, we revisit the design of…

Machine Learning · Computer Science 2026-05-13 Reza Asad , Reza Babanezhad , Sharan Vaswani

Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples with On-Policy Experience

Soft Actor-Critic (SAC) is an off-policy actor-critic reinforcement learning algorithm, essentially based on entropy regularization. SAC trains a policy by maximizing the trade-off between expected return and entropy (randomness in the…

Machine Learning · Computer Science 2021-09-27 Chayan Banerjee , Zhiyong Chen , Nasimul Noman

Soft Decomposed Policy-Critic: Bridging the Gap for Effective Continuous Control with Discrete RL

Discrete reinforcement learning (RL) algorithms have demonstrated exceptional performance in solving sequential decision tasks with discrete action spaces, such as Atari games. However, their effectiveness is hindered when applied to…

Machine Learning · Computer Science 2023-08-22 Yechen Zhang , Jian Sun , Gang Wang , Zhuo Li , Wei Chen

Distributional Reinforcement Learning via the Cram\'er Distance

This paper explores the application of the Soft Actor-Critic (SAC) algorithm within a Distributional Reinforcement Learning setting and introduces an implementation of such algorithm named Cram\'er-based Distributional Soft Actor-Critic…

Machine Learning · Computer Science 2026-05-12 Vanya Aziz , Ivo Nowak , E. M. T Hendrix

Developing cooperative policies for multi-stage tasks

This paper proposes the Cooperative Soft Actor Critic (CSAC) method of enabling consecutive reinforcement learning agents to cooperatively solve a long time horizon multi-stage task. This method is achieved by modifying the policy of each…

Machine Learning · Computer Science 2020-07-02 Jordan Erskine , Chris Lehnert

Distributional Soft Actor-Critic with Diffusion Policy

Reinforcement learning has been proven to be highly effective in handling complex control tasks. Traditional methods typically use unimodal distributions, such as Gaussian distributions, to model the output of value distributions. However,…

Machine Learning · Computer Science 2025-07-14 Tong Liu , Yinuo Wang , Xujie Song , Wenjun Zou , Liangfa Chen , Likun Wang , Bin Shuai , Jingliang Duan , Shengbo Eben Li

TASAC: a twin-actor reinforcement learning framework with stochastic policy for batch process control

Due to their complex nonlinear dynamics and batch-to-batch variability, batch processes pose a challenge for process control. Due to the absence of accurate models and resulting plant-model mismatch, these problems become harder to address…

Machine Learning · Computer Science 2022-05-03 Tanuja Joshi , Hariprasad Kodamana , Harikumar Kandath , Niket Kaisare

Decision-Making under On-Ramp merge Scenarios by Distributional Soft Actor-Critic Algorithm

Merging into the highway from the on-ramp is an essential scenario for automated driving. The decision-making under the scenario needs to balance the safety and efficiency performance to optimize a long-term objective, which is challenging…

Robotics · Computer Science 2021-03-09 Yiting Kong , Yang Guan , Jingliang Duan , Shengbo Eben Li , Qi Sun , Bingbing Nie

Revisiting Discrete Soft Actor-Critic

We study the adaption of Soft Actor-Critic (SAC), which is considered as a state-of-the-art reinforcement learning (RL) algorithm, from continuous action space to discrete action space. We revisit vanilla discrete SAC and provide an…

Machine Learning · Computer Science 2024-11-21 Haibin Zhou , Tong Wei , Zichuan Lin , junyou li , Junliang Xing , Yuanchun Shi , Li Shen , Chao Yu , Deheng Ye

Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies

Deep Reinforcement Learning (DRL) algorithms for continuous action spaces are known to be brittle toward hyperparameters as well as \cut{being}sample inefficient. Soft Actor Critic (SAC) proposes an off-policy deep actor critic algorithm…

Machine Learning · Computer Science 2019-06-10 Patrick Nadeem Ward , Ariella Smofsky , Avishek Joey Bose

Off-Policy Actor-Critic in an Ensemble: Achieving Maximum General Entropy and Effective Environment Exploration in Deep Reinforcement Learning

We propose a new policy iteration theory as an important extension of soft policy iteration and Soft Actor-Critic (SAC), one of the most efficient model free algorithms for deep reinforcement learning. Supported by the new theory, arbitrary…

Machine Learning · Computer Science 2019-02-18 Gang Chen , Yiming Peng

Soft-Robust Actor-Critic Policy-Gradient

Robust Reinforcement Learning aims to derive optimal behavior that accounts for model uncertainty in dynamical systems. However, previous studies have shown that by considering the worst case scenario, robust policies can be overly…

Machine Learning · Computer Science 2018-10-25 Esther Derman , Daniel J. Mankowitz , Timothy A. Mann , Shie Mannor

Wasserstein Barycenter Soft Actor-Critic

Deep off-policy actor-critic algorithms have emerged as the leading framework for reinforcement learning in continuous control domains. However, most of these algorithms suffer from poor sample efficiency, especially in environments with…

Machine Learning · Computer Science 2026-02-25 Zahra Shahrooei , Ali Baheri

Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past

Soft Actor-Critic (SAC) is an off-policy actor-critic deep reinforcement learning (DRL) algorithm based on maximum entropy reinforcement learning. By combining off-policy updates with an actor-critic formulation, SAC achieves…

Machine Learning · Computer Science 2019-06-11 Che Wang , Keith Ross