Related papers: Zeroth-order Deterministic Policy Gradient

Deterministic Value-Policy Gradients

Reinforcement learning algorithms such as the deep deterministic policy gradient algorithm (DDPG) have been widely used in continuous control tasks. However, the model-free DDPG algorithm suffers from high sample complexity. In this paper we…

Machine Learning · Computer Science 2019-11-14 Qingpeng Cai, Ling Pan, Pingzhong Tang

Soft Deterministic Policy Gradient with Gaussian Smoothing

Deterministic policy gradient (DPG) is widely utilized for continuous control; however, it inherently relies on the differentiability of the critic with respect to the action during policy updates. This assumption is violated in practical…

Machine Learning · Computer Science 2026-05-08 Hyunjun Na, Donghwan Lee

Deterministic Policy Gradients With General State Transitions

We study a reinforcement learning setting, where the state transition function is a convex combination of a stochastic continuous function and a deterministic function. Such a setting generalizes the widely-studied stochastic state…

Machine Learning · Computer Science 2018-10-03 Qingpeng Cai, Ling Pan, Pingzhong Tang

Deterministic Policy Gradient for Reinforcement Learning with Continuous Time and State

The theory of continuous-time reinforcement learning (RL) has progressed rapidly in recent years. While the ultimate objective of RL is typically to learn deterministic control policies, most existing continuous-time RL methods rely on…

Machine Learning · Computer Science 2026-03-17 Ziheng Cheng, Xin Guo, Yufei Zhang

Distributed Distributional Deterministic Policy Gradients

This work adopts the very successful distributional perspective on reinforcement learning and adapts it to the continuous control setting. We combine this within a distributed framework for off-policy learning in order to develop what we…

Machine Learning · Computer Science 2018-04-25 Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy Lillicrap

Robust Deterministic Policy Gradient for Disturbance Attenuation and Its Application to Quadrotor Control

This paper presents a robust reinforcement learning algorithm called robust deterministic policy gradient (RDPG), which reformulates the H-infinity control problem as a two-player zero-sum dynamic game between a user and an adversary. The…

Robotics · Computer Science 2025-12-04 Taeho Lee, Donghwan Lee

Regularly Updated Deterministic Policy Gradient Algorithm

The Deep Deterministic Policy Gradient (DDPG) algorithm is one of the most well-known reinforcement learning methods. However, this method is inefficient and unstable in practical applications. On the other hand, the bias and variance of the Q…

Machine Learning · Computer Science 2020-07-02 Shuai Han, Wenbo Zhou, Shuai Lü, Jiayu Yu

Learning to Explore with Meta-Policy Gradient

The performance of off-policy learning, including deep Q-learning and deep deterministic policy gradient (DDPG), critically depends on the choice of the exploration policy. Existing exploration methods are mostly based on adding noise to…

Machine Learning · Computer Science 2018-03-28 Tianbing Xu, Qiang Liu, Liang Zhao, Jian Peng

Hierarchical Deep Deterministic Policy Gradient for Autonomous Maze Navigation of Mobile Robots

Maze navigation is a fundamental challenge in robotics, requiring agents to traverse complex environments efficiently. While the Deep Deterministic Policy Gradient (DDPG) algorithm excels in control tasks, its performance in maze navigation…

Robotics · Computer Science 2025-08-08 Wenjie Hu, Ye Zhou, Hann Woei Ho

Deterministic policy gradient based optimal control with probabilistic constraints

This paper studies a deep deterministic policy gradient (DDPG) based actor critic (AC) reinforcement learning (RL) technique to control a linear discrete-time system with a quadratic control cost while ensuring a constraint on the…

Systems and Control · Electrical Eng. & Systems 2023-12-22 Arunava Naha, Subhrakanti Dey

Compatible Gradient Approximations for Actor-Critic Algorithms

Deterministic policy gradient algorithms are foundational for actor-critic methods in controlling continuous systems, yet they often encounter inaccuracies due to their dependence on the derivative of the critic's value estimates with…

Machine Learning · Computer Science 2025-02-11 Baturay Saglam, Dionysis Kalogerias

Model Free Deep Deterministic Policy Gradient Controller for Setpoint Tracking of Non-minimum Phase Systems

Deep Reinforcement Learning (DRL) techniques have received significant attention in control and decision-making algorithms. Most applications involve complex decision-making systems, justified by the algorithms' computational power and…

Systems and Control · Electrical Eng. & Systems 2024-02-28 Fatemeh Tavakkoli, Pouria Sarhadi, Benoit Clement, Wasif Naeem

Computational Performance of Deep Reinforcement Learning to find Nash Equilibria

We test the performance of deep deterministic policy gradient (DDPG), a deep reinforcement learning algorithm, able to handle continuous state and action spaces, to learn Nash equilibria in a setting where firms compete in prices. These…

Computer Science and Game Theory · Computer Science 2025-09-30 Christoph Graf, Viktor Zobernig, Johannes Schmidt, Claude Klöckl

Data-efficient Deep Reinforcement Learning for Dexterous Manipulation

Deep learning and reinforcement learning methods have recently been used to solve a variety of problems in continuous control domains. An obvious application of these techniques is dexterous manipulation tasks in robotics which are…

Machine Learning · Computer Science 2017-04-12 Ivaylo Popov, Nicolas Heess, Timothy Lillicrap, Roland Hafner, Gabriel Barth-Maron, Matej Vecerik, Thomas Lampe, Yuval Tassa, Tom Erez, Martin Riedmiller

Off-Policy Average Reward Actor-Critic with Deterministic Policy Search

The average reward criterion is relatively less studied as most existing works in the Reinforcement Learning literature consider the discounted reward criterion. There are few recent works that present on-policy average reward actor-critic…

Machine Learning · Computer Science 2023-07-20 Naman Saxena, Subhojyoti Khastigir, Shishir Kolathaya, Shalabh Bhatnagar

Structure Matters: Dynamic Policy Gradient

In this work, we study $\gamma$-discounted infinite-horizon tabular Markov decision processes (MDPs) and introduce a framework called dynamic policy gradient (DynPG). The framework directly integrates dynamic programming with (any) policy…

Machine Learning · Computer Science 2024-11-08 Sara Klein, Xiangyuan Zhang, Tamer Başar, Simon Weissmann, Leif Döring

Policy Gradient-based Model Free Optimal LQG Control with a Probabilistic Risk Constraint

In this paper, we investigate a model-free optimal control design that minimizes an infinite horizon average expected quadratic cost of states and control actions subject to a probabilistic risk or chance constraint using input-output data.…

Systems and Control · Electrical Eng. & Systems 2024-11-11 Arunava Naha, Subhrakanti Dey

Edge Delayed Deep Deterministic Policy Gradient: efficient continuous control for edge scenarios

Deep Reinforcement Learning is gaining increasing attention thanks to its capability to learn complex policies in high-dimensional settings. Recent advancements utilize a dual-network architecture to learn optimal policies through the…

Machine Learning · Computer Science 2025-10-14 Alberto Sinigaglia, Niccolò Turcato, Ruggero Carli, Gian Antonio Susto

Controlling an Inverted Pendulum with Policy Gradient Methods-A Tutorial

This paper provides the details of implementing two important policy gradient methods to solve the inverted pendulum problem. These are namely the Deep Deterministic Policy Gradient (DDPG) and the Proximal Policy Optimization (PPO)…

Machine Learning · Computer Science 2021-05-18 Swagat Kumar

Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs

We study the problem of computing deterministic optimal policies for constrained Markov decision processes (MDPs) with continuous state and action spaces, which are widely encountered in constrained dynamical systems. Designing…

Artificial Intelligence · Computer Science 2025-04-07 Sergio Rozada, Dongsheng Ding, Antonio G. Marques, Alejandro Ribeiro