Related papers: Risk-Sensitive Exponential Actor Critic

Risk-Sensitive Reinforcement Learning with Exponential Criteria

While reinforcement learning has shown experimental success in a number of applications, it is known to be sensitive to noise and perturbations in the parameters of the system, leading to high variance in the total reward amongst different…

Systems and Control · Electrical Eng. & Systems 2024-12-02 Erfaun Noorani, Christos Mavridis, John Baras

Soft Actor-Critic Algorithms and Applications

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample…

Machine Learning · Computer Science 2019-09-16 Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and…

Machine Learning · Computer Science 2018-08-10 Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine

Balancing Value Underestimation and Overestimation with Realistic Actor-Critic

Model-free deep reinforcement learning (RL) has been successfully applied to challenging continuous control domains. However, poor sample efficiency prevents these methods from being widely used in real-world domains. This paper introduces…

Machine Learning · Computer Science 2022-10-27 Sicen Li, Qinyun Tang, Yiming Pang, Xinmeng Ma, Gang Wang

Entropic Risk Constrained Soft-Robust Policy Optimization

Having a perfect model to compute the optimal policy is often infeasible in reinforcement learning. It is important in high-stakes domains to quantify and manage risk induced by model uncertainties. Entropic risk measure is an exponential…

Machine Learning · Computer Science 2020-06-23 Reazul Hasan Russel, Bahram Behzadian, Marek Petrik

Reinforcement Learning with Elastic Time Steps

Traditional Reinforcement Learning (RL) policies are typically implemented with fixed control rates, often disregarding the impact of control rate selection. This can lead to inefficiencies as the optimal control rate varies with task…

Robotics · Computer Science 2024-08-13 Dong Wang, Giovanni Beltrame

DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning

We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines the strengths of distributional information of accumulated rewards and entropy-driven exploration from Soft…

Machine Learning · Computer Science 2025-07-01 Xiaoteng Ma, Junyao Chen, Li Xia, Jun Yang, Qianchuan Zhao, Zhengyuan Zhou

Actor-Critic Algorithm for Dynamic Expectile and CVaR

Optimizing dynamic risk with stochastic policies is challenging in both policy updates and value learning. The former typically requires transition perturbation, while the latter may rely on model-based approaches. To address these…

Machine Learning · Computer Science 2026-05-11 Yudong Luo, Erick Delage

Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors

In reinforcement learning (RL), function approximation errors are known to easily lead to the Q-value overestimations, thus greatly reducing policy performance. This paper presents a distributional soft actor-critic (DSAC) algorithm, which…

Machine Learning · Computer Science 2021-06-14 Jingliang Duan, Yang Guan, Shengbo Eben Li, Yangang Ren, Bo Cheng

Off-Policy Actor-Critic with Sigmoid-Bounded Entropy for Real-World Robot Learning

Deploying reinforcement learning in the real world remains challenging due to sample inefficiency, sparse rewards, and noisy visual observations. Prior work leverages demonstrations and human feedback to improve learning efficiency and…

Artificial Intelligence · Computer Science 2026-01-23 Xiefeng Wu, Mingyu Hu, Shu Zhang

S$^2$AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic

Learning expressive stochastic policies instead of deterministic ones has been proposed to achieve better stability, sample complexity, and robustness. Notably, in Maximum Entropy Reinforcement Learning (MaxEnt RL), the policy is modeled as…

Machine Learning · Computer Science 2024-05-03 Safa Messaoud, Billel Mokeddem, Zhenghai Xue, Linsey Pang, Bo An, Haipeng Chen, Sanjay Chawla

Risk-sensitive Actor-Critic with Static Spectral Risk Measures for Online and Offline Reinforcement Learning

The development of Distributional Reinforcement Learning (DRL) has introduced a natural way to incorporate risk sensitivity into value-based and actor-critic methods by employing risk measures other than expectation in the value function.…

Machine Learning · Computer Science 2025-07-08 Mehrdad Moghimi, Hyejin Ku

RAMAC: Multimodal Risk-Aware Offline Reinforcement Learning and the Role of Behavior Regularization

In safety-critical domains where online data collection is infeasible, offline reinforcement learning (RL) offers an attractive alternative but only if policies deliver high returns without incurring catastrophic lower-tail risk. Prior work…

Machine Learning · Computer Science 2025-12-09 Kai Fukazawa, Kunal Mundada, Iman Soltani

Model-Based Actor-Critic with Chance Constraint for Stochastic System

Safety is essential for reinforcement learning (RL) applied in real-world situations. Chance constraints are suitable to represent the safety requirements in stochastic systems. Previous chance-constrained RL methods usually have a low…

Machine Learning · Computer Science 2021-03-17 Baiyu Peng, Yao Mu, Yang Guan, Shengbo Eben Li, Yuming Yin, Jianyu Chen

On the Theory of Risk-Aware Agents: Bridging Actor-Critic and Economics

Risk-aware Reinforcement Learning (RL) algorithms like SAC and TD3 were shown empirically to outperform their risk-neutral counterparts in a variety of continuous-action tasks. However, the theoretical basis for the pessimistic objectives…

Machine Learning · Computer Science 2024-05-27 Michal Nauman, Marek Cygan

Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past

Soft Actor-Critic (SAC) is an off-policy actor-critic deep reinforcement learning (DRL) algorithm based on maximum entropy reinforcement learning. By combining off-policy updates with an actor-critic formulation, SAC achieves…

Machine Learning · Computer Science 2019-06-11 Che Wang, Keith Ross

OPAC: Opportunistic Actor-Critic

Actor-critic methods, a type of model-free reinforcement learning (RL), have achieved state-of-the-art performances in many real-world domains in continuous control. Despite their success, the wide-scale deployment of these models is still…

Machine Learning · Computer Science 2020-12-14 Srinjoy Roy, Saptam Bakshi, Tamal Maharaj

SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics

Although Reinforcement Learning (RL) is effective for sequential decision-making problems under uncertainty, it still fails to thrive in real-world systems where risk or safety is a binding constraint. In this paper, we formulate the RL…

Machine Learning · Computer Science 2022-07-07 Yannis Flet-Berliac, Debabrota Basu

Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation

We study robust reinforcement learning (RL) with the goal of determining a well-performing policy that is robust against model mismatch between the training simulator and the testing environment. Previous policy-based robust RL algorithms…

Machine Learning · Computer Science 2023-12-12 Ruida Zhou, Tao Liu, Min Cheng, Dileep Kalathil, P. R. Kumar, Chao Tian

Maximum Mutation Reinforcement Learning for Scalable Control

Advances in Reinforcement Learning (RL) have demonstrated data efficiency and optimal control over large state spaces at the cost of scalable performance. Genetic methods, on the other hand, provide scalability but depict hyperparameter…

Machine Learning · Computer Science 2021-01-19 Karush Suri, Xiao Qi Shi, Konstantinos N. Plataniotis, Yuri A. Lawryshyn