Related papers: Soft-Robust Actor-Critic Policy-Gradient

DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty

Deep reinforcement learning (RL) has achieved remarkable success, yet its deployment in real-world scenarios is often limited by vulnerability to environmental uncertainties. Distributionally robust RL (DR-RL) algorithms have been proposed…

Machine Learning · Computer Science 2026-04-21 Mingxuan Cui, Duo Zhou, Yuxuan Han, Grani A. Hanasusanto, Qiong Wang, Huan Zhang, Zhengyuan Zhou

Effective Reinforcement Learning Control using Conservative Soft Actor-Critic

Reinforcement Learning (RL) has shown great potential in complex control tasks, particularly when combined with deep neural networks within the Actor-Critic (AC) framework. However, in practical applications, balancing exploration, learning…

Robotics · Computer Science 2026-02-25 Zhiwei Shang, Xinyi Yuan, Wenjun Huang, Yunduan Cui, Di Chen, Meixin Zhu

Soft Actor-Critic Algorithms and Applications

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample…

Machine Learning · Computer Science 2019-09-16 Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation

We study robust reinforcement learning (RL) with the goal of determining a well-performing policy that is robust against model mismatch between the training simulator and the testing environment. Previous policy-based robust RL algorithms…

Machine Learning · Computer Science 2023-12-12 Ruida Zhou, Tao Liu, Min Cheng, Dileep Kalathil, P. R. Kumar, Chao Tian

PAC-Bayesian Soft Actor-Critic Learning

Actor-critic algorithms address the dual goals of reinforcement learning (RL), policy evaluation and improvement via two separate function approximators. The practicality of this approach comes at the expense of training instability, caused…

Machine Learning · Computer Science 2024-06-11 Bahareh Tasdighi, Abdullah Akgül, Manuel Haussmann, Kenny Kazimirzak Brink, Melih Kandemir

Adversarial Skill Learning for Robust Manipulation

Deep reinforcement learning has made significant progress in robotic manipulation tasks and it works well in the ideal disturbance-free environment. However, in a real-world environment, both internal and external disturbances are…

Robotics · Computer Science 2020-11-09 Pingcheng Jian, Chao Yang, Di Guo, Huaping Liu, Fuchun Sun

SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics

Although Reinforcement Learning (RL) is effective for sequential decision-making problems under uncertainty, it still fails to thrive in real-world systems where risk or safety is a binding constraint. In this paper, we formulate the RL…

Machine Learning · Computer Science 2022-07-07 Yannis Flet-Berliac, Debabrota Basu

Safe Reinforcement Learning with Dual Robustness

Reinforcement learning (RL) agents are vulnerable to adversarial disturbances, which can deteriorate task performance or compromise safety specifications. Existing methods either address safety requirements under the assumption of no…

Machine Learning · Computer Science 2023-09-14 Zeyang Li, Chuxiong Hu, Yunan Wang, Yujie Yang, Shengbo Eben Li

Soft Actor-Critic with Cross-Entropy Policy Optimization

Soft Actor-Critic (SAC) is one of the state-of-the-art off-policy reinforcement learning (RL) algorithms that is within the maximum entropy based RL framework. SAC is demonstrated to perform very well in a list of continuous control tasks…

Machine Learning · Computer Science 2021-12-22 Zhenyang Shi, Surya P. N. Singh

Distributionally Robust Reinforcement Learning

Real-world applications require RL algorithms to act safely. During the learning process, it is likely that the agent executes sub-optimal actions that may lead to unsafe/poor states of the system. Exploration is particularly brittle in…

Machine Learning · Statistics 2019-06-17 Elena Smirnova, Elvis Dohmatob, Jérémie Mary

An Actor-Critic Method for Simulation-Based Optimization

We focus on a simulation-based optimization problem of choosing the best design from the feasible space. Although the simulation model can be queried with finite samples, its internal processing rule cannot be utilized in the optimization…

Machine Learning · Computer Science 2021-11-02 Kuo Li, Qing-Shan Jia, Jiaqi Yan

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and…

Machine Learning · Computer Science 2018-08-10 Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine

Risk-Sensitive Soft Actor-Critic for Robust Deep Reinforcement Learning under Distribution Shifts

We study the robustness of deep reinforcement learning algorithms against distribution shifts within contextual multi-stage stochastic combinatorial optimization problems from the operations research domain. In this context, risk-sensitive…

Machine Learning · Computer Science 2024-02-16 Tobias Enders, James Harrison, Maximilian Schiffer

Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples with On-Policy Experience

Soft Actor-Critic (SAC) is an off-policy actor-critic reinforcement learning algorithm, essentially based on entropy regularization. SAC trains a policy by maximizing the trade-off between expected return and entropy (randomness in the…

Machine Learning · Computer Science 2021-09-27 Chayan Banerjee, Zhiyong Chen, Nasimul Noman

Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty

Robust reinforcement learning (RL) aims to find a policy that optimizes the worst-case performance in the face of uncertainties. In this paper, we focus on action robust RL with the probabilistic policy execution uncertainty, in which,…

Machine Learning · Computer Science 2023-07-21 Guanlin Liu, Zhihan Zhou, Han Liu, Lifeng Lai

Distributional Reinforcement Learning via the Cramér Distance

This paper explores the application of the Soft Actor-Critic (SAC) algorithm within a Distributional Reinforcement Learning setting and introduces an implementation of such algorithm named Cramér-based Distributional Soft Actor-Critic…

Machine Learning · Computer Science 2026-05-12 Vanya Aziz, Ivo Nowak, E. M. T. Hendrix

Soft-Robust Algorithms for Batch Reinforcement Learning

In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion, which minimizes the probability of a catastrophic failure. Unfortunately,…

Machine Learning · Computer Science 2021-03-01 Elita A. Lobo, Mohammad Ghavamzadeh, Marek Petrik

Development of a Soft Actor Critic Deep Reinforcement Learning Approach for Harnessing Energy Flexibility in a Large Office Building

This research is concerned with the novel application and investigation of 'Soft Actor Critic' (SAC) based Deep Reinforcement Learning (DRL) to control the cooling setpoint (and hence cooling loads) of a large commercial building to harness…

Machine Learning · Computer Science 2021-07-08 Anjukan Kathirgamanathan, Eleni Mangina, Donal P. Finn

DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning

We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines the strengths of distributional information of accumulated rewards and entropy-driven exploration from Soft…

Machine Learning · Computer Science 2025-07-01 Xiaoteng Ma, Junyao Chen, Li Xia, Jun Yang, Qianchuan Zhao, Zhengyuan Zhou

Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies

Deep Reinforcement Learning (DRL) algorithms for continuous action spaces are known to be brittle toward hyperparameters as well as sample inefficient. Soft Actor Critic (SAC) proposes an off-policy deep actor critic algorithm…

Machine Learning · Computer Science 2019-06-10 Patrick Nadeem Ward, Ariella Smofsky, Avishek Joey Bose