Related papers: Analysis of a Target-Based Actor-Critic Algorithm …

A Convergent Online Single Time Scale Actor Critic Algorithm

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their generality, good convergence properties, and possible biological…

Machine Learning · Computer Science 2009-09-17 D. Di Castro, R. Meir

Finite-time analysis of single-timescale actor-critic

Actor-critic methods have achieved significant success in many challenging applications. However, their finite-time convergence is still poorly understood in the most practical single-timescale form. Existing works on analyzing…

Machine Learning · Computer Science 2024-01-29 Xuyang Chen, Lin Zhao

Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees

Actor-critic (AC) methods are widely used in reinforcement learning (RL) and benefit from the flexibility of using any policy gradient method as the actor and value-based method as the critic. The critic is usually trained by minimizing the…

Machine Learning · Computer Science 2023-11-01 Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux

Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework

In this paper, we propose actor-director-critic, a new framework for deep reinforcement learning. Compared with the actor-critic framework, the director role is added, and action classification and action evaluation are applied…

Machine Learning · Computer Science 2023-01-11 Zongwei Liu, Yonghong Song, Yuanlin Zhang

A Theoretical Justification for Asymmetric Actor-Critic Algorithms

In reinforcement learning for partially observable environments, many successful algorithms have been developed within the asymmetric learning paradigm. This paradigm leverages additional state information available at training time for…

Machine Learning · Computer Science 2025-09-09 Gaspard Lambrechts, Damien Ernst, Aditya Mahajan

A Finite Time Analysis of Two Time-Scale Actor Critic Methods

Actor-critic (AC) methods have exhibited great empirical success compared with other reinforcement learning algorithms, where the actor uses the policy gradient to improve the learning policy and the critic uses temporal difference learning…

Machine Learning · Computer Science 2022-10-11 Yue Wu, Weitong Zhang, Pan Xu, Quanquan Gu

Two-Timescale Critic-Actor for Average Reward MDPs with Function Approximation

Several recent works have focused on carrying out non-asymptotic convergence analyses for AC algorithms. Recently, a two-timescale critic-actor algorithm has been presented for the discounted cost setting in the look-up table case where the…

Machine Learning · Computer Science 2025-09-01 Prashansa Panda, Shalabh Bhatnagar

Actor-Critic or Critic-Actor? A Tale of Two Time Scales

We revisit the standard formulation of the tabular actor-critic algorithm as a two time-scale stochastic approximation with value function computed on a faster time-scale and policy computed on a slower time-scale. This emulates policy…

Machine Learning · Computer Science 2024-06-21 Shalabh Bhatnagar, Vivek S. Borkar, Soumyajit Guin

Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning

Actor-critic methods for decentralized multi-agent reinforcement learning (MARL) facilitate collaborative optimal decision making without centralized coordination, thus enabling a wide range of applications in practice. To date, however,…

Machine Learning · Computer Science 2025-08-14 Zhiyao Zhang, Myeung Suk Oh, FNU Hairi, Ziyue Luo, Alvaro Velasquez, Jia Liu

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy

We study the global convergence and global optimality of actor-critic, one of the most popular families of reinforcement learning algorithms. While most existing works on actor-critic employ bi-level or two-timescale updates, we focus on…

Machine Learning · Computer Science 2021-06-15 Zuyue Fu, Zhuoran Yang, Zhaoran Wang

Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm

Actor-critic style two-time-scale algorithms are one of the most popular methods in reinforcement learning, and have seen great empirical success. However, their performance is not completely understood theoretically. In this paper, we…

Machine Learning · Computer Science 2022-02-22 Sajad Khodadadian, Thinh T. Doan, Justin Romberg, Siva Theja Maguluri

Double Actor-Critic with TD Error-Driven Regularization in Reinforcement Learning

To obtain better value estimation in reinforcement learning, we propose a novel algorithm based on the double actor-critic framework with temporal difference error-driven regularization, abbreviated as TDDR. TDDR employs double actors, with…

Machine Learning · Computer Science 2024-10-01 Haohui Chen, Zhiyong Chen, Aoxiang Liu, Wentuo Fang

Actor-Attention-Critic for Multi-Agent Reinforcement Learning

Reinforcement learning in multi-agent scenarios is important for real-world applications but presents challenges beyond those seen in single-agent settings. We present an actor-critic algorithm that trains decentralized policies in…

Machine Learning · Computer Science 2019-05-29 Shariq Iqbal, Fei Sha

Cooperative Actor-Critic via TD Error Aggregation

In decentralized cooperative multi-agent reinforcement learning, agents can aggregate information from one another to learn policies that maximize a team-average objective function. Despite the willingness to cooperate with others, the…

Systems and Control · Electrical Eng. & Systems 2022-07-27 Martin Figura, Yixuan Lin, Ji Liu, Vijay Gupta

Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms

As an important type of reinforcement learning algorithm, actor-critic (AC) and natural actor-critic (NAC) algorithms are often executed in two ways for finding optimal policies. In the first nested-loop design, actor's one update of…

Machine Learning · Computer Science 2020-05-11 Tengyu Xu, Zhe Wang, Yingbin Liang

A Communication-Efficient Decentralized Actor-Critic Algorithm

In this paper, we study the problem of reinforcement learning in multi-agent systems where communication among agents is limited. We develop a decentralized actor-critic learning framework in which each agent performs several local updates…

Machine Learning · Computer Science 2025-10-23 Xiaoxing Ren, Nicola Bastianello, Thomas Parisini, Andreas A. Malikopoulos

Finite-time Convergence Analysis of Actor-Critic with Evolving Reward

Many popular practical reinforcement learning (RL) algorithms employ evolving reward functions (through techniques such as reward shaping, entropy regularization, or curriculum learning), yet their theoretical foundations remain…

Machine Learning · Computer Science 2025-10-15 Rui Hu, Yu Chen, Longbo Huang

Target-Based Temporal Difference Learning

The use of target networks has been a popular and key component of recent deep Q-learning algorithms for reinforcement learning, yet little is known from the theory side. In this work, we introduce a new family of target-based temporal…

Machine Learning · Computer Science 2019-09-24 Donghwan Lee, Niao He

Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic

Decentralized Actor-Critic (AC) algorithms have been widely utilized for multi-agent reinforcement learning (MARL) and have achieved remarkable success. Apart from their empirical success, the theoretical convergence property of decentralized…

Machine Learning · Computer Science 2023-01-31 Qijun Luo, Xiao Li

Finite Time Analysis of Constrained Natural Critic-Actor Algorithm with Improved Sample Complexity

Recent studies have increasingly focused on non-asymptotic convergence analyses for actor-critic (AC) algorithms. One such effort introduced a two-timescale critic-actor algorithm for the discounted cost setting using a tabular…

Machine Learning · Computer Science 2025-10-07 Prashansa Panda, Shalabh Bhatnagar