Related papers: Zeroth-Order Actor-Critic: An Evolutionary Framewo…

Actor-Critic for Linearly-Solvable Continuous MDP with Partially Known Dynamics

In many robotic applications, some aspects of the system dynamics can be modeled accurately while others are difficult to obtain or model. We present a novel reinforcement learning (RL) method for continuous state and action spaces that…

Artificial Intelligence · Computer Science 2017-06-06 Tomoki Nishi , Prashant Doshi , Michael R. James , Danil Prokhorov

OPAC: Opportunistic Actor-Critic

Actor-critic methods, a type of model-free reinforcement learning (RL), have achieved state-of-the-art performances in many real-world domains in continuous control. Despite their success, the wide-scale deployment of these models is still…

Machine Learning · Computer Science 2020-12-14 Srinjoy Roy , Saptam Bakshi , Tamal Maharaj

Actor-Critic Reinforcement Learning with Phased Actor

Policy gradient methods in actor-critic reinforcement learning (RL) have become perhaps the most promising approaches to solving continuous optimal control problems. However, the trial-and-error nature of RL and the inherent randomness…

Machine Learning · Computer Science 2024-04-19 Ruofan Wu , Junmin Zhong , Jennie Si

Learning Sampling Policy for Faster Derivative Free Optimization

Zeroth-order (ZO, also known as derivative-free) methods, which estimate the gradient only by two function evaluations, have attracted much attention recently because of its broad applications in machine learning community. The two function…

Machine Learning · Computer Science 2021-04-12 Zhou Zhai , Bin Gu , Heng Huang

Soft Actor-Critic Algorithms and Applications

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample…

Machine Learning · Computer Science 2019-09-16 Tuomas Haarnoja , Aurick Zhou , Kristian Hartikainen , George Tucker , Sehoon Ha , Jie Tan , Vikash Kumar , Henry Zhu , Abhishek Gupta , Pieter Abbeel , Sergey Levine

Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model

Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations. However, these high-dimensional observation spaces present a number of challenges in practice, since the policy must…

Machine Learning · Computer Science 2020-10-27 Alex X. Lee , Anusha Nagabandi , Pieter Abbeel , Sergey Levine

Maximum Mutation Reinforcement Learning for Scalable Control

Advances in Reinforcement Learning (RL) have demonstrated data efficiency and optimal control over large state spaces at the cost of scalable performance. Genetic methods, on the other hand, provide scalability but depict hyperparameter…

Machine Learning · Computer Science 2021-01-19 Karush Suri , Xiao Qi Shi , Konstantinos N. Plataniotis , Yuri A. Lawryshyn

Zeroth-Order Optimization is Secretly Single-Step Policy Optimization

Zeroth-Order Optimization (ZOO) provides powerful tools for optimizing functions where explicit gradients are unavailable or expensive to compute. However, the underlying mechanisms of popular ZOO methods, particularly those employing…

Machine Learning · Computer Science 2025-06-18 Junbin Qiu , Zhengpeng Xie , Xiangda Yan , Yongjie Yang , Yao Shu

Risk-Sensitive Exponential Actor Critic

Model-free deep reinforcement learning (RL) algorithms have achieved tremendous success on a range of challenging tasks. However, safety concerns remain when these methods are deployed on real-world applications, necessitating risk-aware…

Machine Learning · Computer Science 2026-02-10 Alonso Granados , Jason Pacheco

On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation

Reinforcement learning, mathematically described by Markov Decision Problems, may be approached either through dynamic programming or policy search. Actor-critic algorithms combine the merits of both approaches by alternating between steps…

Machine Learning · Computer Science 2023-01-31 Harshat Kumar , Alec Koppel , Alejandro Ribeiro

TASAC: a twin-actor reinforcement learning framework with stochastic policy for batch process control

Due to their complex nonlinear dynamics and batch-to-batch variability, batch processes pose a challenge for process control. Due to the absence of accurate models and resulting plant-model mismatch, these problems become harder to address…

Machine Learning · Computer Science 2022-05-03 Tanuja Joshi , Hariprasad Kodamana , Harikumar Kandath , Niket Kaisare

A Single-Loop Deep Actor-Critic Algorithm for Constrained Reinforcement Learning with Provable Convergence

Deep Actor-Critic algorithms, which combine Actor-Critic with deep neural network (DNN), have been among the most prevalent reinforcement learning algorithms for decision-making problems in simulated environments. However, the existing deep…

Machine Learning · Computer Science 2024-09-19 Kexuan Wang , An Liu , Baishuo Lin

Soft Decomposed Policy-Critic: Bridging the Gap for Effective Continuous Control with Discrete RL

Discrete reinforcement learning (RL) algorithms have demonstrated exceptional performance in solving sequential decision tasks with discrete action spaces, such as Atari games. However, their effectiveness is hindered when applied to…

Machine Learning · Computer Science 2023-08-22 Yechen Zhang , Jian Sun , Gang Wang , Zhuo Li , Wei Chen

Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees

Actor-critic (AC) methods are widely used in reinforcement learning (RL) and benefit from the flexibility of using any policy gradient method as the actor and value-based method as the critic. The critic is usually trained by minimizing the…

Machine Learning · Computer Science 2023-11-01 Sharan Vaswani , Amirreza Kazemi , Reza Babanezhad , Nicolas Le Roux

Evolution-Guided Policy Gradient in Reinforcement Learning

Deep Reinforcement Learning (DRL) algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically suffer from three core difficulties: temporal credit assignment with sparse rewards, lack…

Machine Learning · Computer Science 2018-10-30 Shauharda Khadka , Kagan Tumer

Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning

Offline reinforcement learning (RL) is a promising approach for many control applications but faces challenges such as limited data coverage and value function overestimation. In this paper, we propose an implicit actor-critic (iAC)…

Machine Learning · Computer Science 2024-08-29 Vanshaj Khattar , Ming Jin

An Actor-Critic Method for Simulation-Based Optimization

We focus on a simulation-based optimization problem of choosing the best design from the feasible space. Although the simulation model can be queried with finite samples, its internal processing rule cannot be utilized in the optimization…

Machine Learning · Computer Science 2021-11-02 Kuo Li , Qing-Shan Jia , Jiaqi Yan

Bayesian Sequential Optimal Experimental Design for Nonlinear Models Using Policy Gradient Reinforcement Learning

We present a mathematical framework and computational methods to optimally design a finite number of sequential experiments. We formulate this sequential optimal experimental design (sOED) problem as a finite-horizon partially observable…

Machine Learning · Computer Science 2024-03-28 Wanggang Shen , Xun Huan

Breaking the Computational Barrier: Provably Efficient Actor-Critic for Low-Rank MDPs

Reinforcement learning (RL) is a fundamental framework for sequential decision-making, in which an agent learns an optimal policy through interactions with an unknown environment. In settings with function approximation, many existing RL…

Machine Learning · Computer Science 2026-05-05 Ruiquan Huang , Donghao Li , Yingbin Liang , Jing Yang

Automated Parking Trajectory Generation Using Deep Reinforcement Learning

Autonomous parking is a key technology in modern autonomous driving systems, requiring high precision, strong adaptability, and efficiency in complex environments. This paper proposes a Deep Reinforcement Learning (DRL) framework based on…

Robotics · Computer Science 2025-05-01 Zheyu Zhang , Yutong Luo , Yongzhou Chen , Haopeng Zhao , Zhichao Ma , Hao Liu