Related papers: Soft Actor-Critic With Integer Actions

Soft Actor-Critic with Beta Policy via Implicit Reparameterization Gradients

Recent advances in deep reinforcement learning have achieved impressive results in a wide range of complex tasks, but poor sample efficiency remains a major obstacle to real-world deployment. Soft actor-critic (SAC) mitigates this problem…

Machine Learning · Computer Science 2024-09-10 Luca Della Libera

Soft Actor-Critic Algorithms and Applications

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample…

Machine Learning · Computer Science 2019-09-16 Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples with On-Policy Experience

Soft Actor-Critic (SAC) is an off-policy actor-critic reinforcement learning algorithm, essentially based on entropy regularization. SAC trains a policy by maximizing the trade-off between expected return and entropy (randomness in the…

Machine Learning · Computer Science 2021-09-27 Chayan Banerjee, Zhiyong Chen, Nasimul Noman

Soft-Robust Actor-Critic Policy-Gradient

Robust Reinforcement Learning aims to derive optimal behavior that accounts for model uncertainty in dynamical systems. However, previous studies have shown that by considering the worst case scenario, robust policies can be overly…

Machine Learning · Computer Science 2018-10-25 Esther Derman, Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor

SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics

Although Reinforcement Learning (RL) is effective for sequential decision-making problems under uncertainty, it still fails to thrive in real-world systems where risk or safety is a binding constraint. In this paper, we formulate the RL…

Machine Learning · Computer Science 2022-07-07 Yannis Flet-Berliac, Debabrota Basu

Effective Reinforcement Learning Control using Conservative Soft Actor-Critic

Reinforcement Learning (RL) has shown great potential in complex control tasks, particularly when combined with deep neural networks within the Actor-Critic (AC) framework. However, in practical applications, balancing exploration, learning…

Robotics · Computer Science 2026-02-25 Zhiwei Shang, Xinyi Yuan, Wenjun Huang, Yunduan Cui, Di Chen, Meixin Zhu

Adversarial Skill Learning for Robust Manipulation

Deep reinforcement learning has made significant progress in robotic manipulation tasks and it works well in the ideal disturbance-free environment. However, in a real-world environment, both internal and external disturbances are…

Robotics · Computer Science 2020-11-09 Pingcheng Jian, Chao Yang, Di Guo, Huaping Liu, Fuchun Sun

Soft Actor-Critic Algorithm with Truly-satisfied Inequality Constraint

Soft actor-critic (SAC) in reinforcement learning is expected to be one of the next-generation robot control schemes. Its ability to maximize policy entropy would make a robotic controller robust to noise and perturbation, which is useful…

Machine Learning · Computer Science 2023-07-04 Taisuke Kobayashi

Revisiting Discrete Soft Actor-Critic

We study the adaptation of Soft Actor-Critic (SAC), which is considered a state-of-the-art reinforcement learning (RL) algorithm, from continuous action space to discrete action space. We revisit vanilla discrete SAC and provide an…

Machine Learning · Computer Science 2024-11-21 Haibin Zhou, Tong Wei, Zichuan Lin, junyou li, Junliang Xing, Yuanchun Shi, Li Shen, Chao Yu, Deheng Ye

Soft Actor-Critic for Discrete Action Settings

Soft Actor-Critic is a state-of-the-art reinforcement learning algorithm for continuous action settings that is not applicable to discrete action settings. Many important settings involve discrete actions, however, and so here we derive an…

Machine Learning · Computer Science 2019-10-21 Petros Christodoulou

Generalizing soft actor-critic algorithms to discrete action spaces

ATARI is a suite of video games used by reinforcement learning (RL) researchers to test the effectiveness of the learning algorithm. Receiving only the raw pixels and the game score, the agent learns to develop sophisticated strategies,…

Machine Learning · Computer Science 2024-07-17 Le Zhang, Yong Gu, Xin Zhao, Yanshuo Zhang, Shu Zhao, Yifei Jin, Xinxin Wu

Distributional Reinforcement Learning via the Cramér Distance

This paper explores the application of the Soft Actor-Critic (SAC) algorithm within a Distributional Reinforcement Learning setting and introduces an implementation of such algorithm named Cramér-based Distributional Soft Actor-Critic…

Machine Learning · Computer Science 2026-05-12 Vanya Aziz, Ivo Nowak, E. M. T Hendrix

Soft Actor-Critic with Cross-Entropy Policy Optimization

Soft Actor-Critic (SAC) is one of the state-of-the-art off-policy reinforcement learning (RL) algorithms that is within the maximum entropy based RL framework. SAC is demonstrated to perform very well in a list of continuous control tasks…

Machine Learning · Computer Science 2021-12-22 Zhenyang Shi, Surya P. N. Singh

PAC-Bayesian Soft Actor-Critic Learning

Actor-critic algorithms address the dual goals of reinforcement learning (RL), policy evaluation and improvement via two separate function approximators. The practicality of this approach comes at the expense of training instability, caused…

Machine Learning · Computer Science 2024-06-11 Bahareh Tasdighi, Abdullah Akgül, Manuel Haussmann, Kenny Kazimirzak Brink, Melih Kandemir

Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision

Low-precision training has become a popular approach to reduce compute requirements, memory footprint, and energy consumption in supervised learning. In contrast, this promising approach has not yet enjoyed similarly widespread adoption…

Machine Learning · Computer Science 2021-06-07 Johan Bjorck, Xiangyu Chen, Christopher De Sa, Carla P. Gomes, Kilian Q. Weinberger

Reinforcement Learning Architectures: SAC, TAC, and ESAC

The trend is to implement intelligent agents capable of analyzing available information and utilize it efficiently. This work presents a number of reinforcement learning (RL) architectures; one of them is designed for intelligent agents.…

Machine Learning · Computer Science 2020-04-07 Ala'eddin Masadeh, Zhengdao Wang, Ahmed E. Kamal

Regularization of Soft Actor-Critic Algorithms with Automatic Temperature Adjustment

This work presents a comprehensive analysis to regularize the Soft Actor-Critic (SAC) algorithm with automatic temperature adjustment. The policy evaluation, the policy improvement and the temperature adjustment are reformulated,…

Machine Learning · Computer Science 2023-05-24 Ben You

Discriminator Soft Actor Critic without Extrinsic Rewards

It is difficult to be able to imitate well in unknown states from a small amount of expert data and sampling data. Supervised learning methods such as Behavioral Cloning do not require sampling data, but usually suffer from distribution…

Machine Learning · Computer Science 2020-02-03 Daichi Nishio, Daiki Kuyoshi, Toi Tsuneda, Satoshi Yamane

Rethinking Soft Actor-Critic in High-Dimensional Action Spaces: The Cost of Ignoring Distribution Shift

Soft Actor-Critic algorithm is widely recognized for its robust performance across a range of deep reinforcement learning tasks, where it leverages the tanh transformation to constrain actions within bounded limits. However, this…

Machine Learning · Computer Science 2025-04-23 Yanjun Chen, Xinming Zhang, Xianghui Wang, Zhiqiang Xu, Xiaoyu Shen, Wei Zhang

SACn: Soft Actor-Critic with n-step Returns

Soft Actor-Critic (SAC) is widely used in practical applications and is now one of the most relevant off-policy online model-free reinforcement learning (RL) methods. The technique of n-step returns is known to increase the convergence…

Machine Learning · Computer Science 2025-12-16 Jakub Łyskawa, Jakub Lewandowski, Paweł Wawrzyński