Related papers: Sample-based Distributional Policy Gradient

Deterministic Policy Gradient for Reinforcement Learning with Continuous Time and State

The theory of continuous-time reinforcement learning (RL) has progressed rapidly in recent years. While the ultimate objective of RL is typically to learn deterministic control policies, most existing continuous-time RL methods rely on…

Machine Learning · Computer Science 2026-03-17 Ziheng Cheng, Xin Guo, Yufei Zhang

Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence

Risk-sensitive reinforcement learning (RL) is crucial for maintaining reliable performance in high-stakes applications. While traditional RL methods aim to learn a point estimate of the random cumulative cost, distributional RL (DRL) seeks…

Machine Learning · Computer Science 2025-02-03 Minheng Xiao, Xian Yu, Lei Ying

Distributed Distributional Deterministic Policy Gradients

This work adopts the very successful distributional perspective on reinforcement learning and adapts it to the continuous control setting. We combine this within a distributed framework for off-policy learning in order to develop what we…

Machine Learning · Computer Science 2018-04-25 Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy Lillicrap

Data-regularized Reinforcement Learning for Diffusion Models at Scale

Aligning generative diffusion models with human preferences via reinforcement learning (RL) is critical yet challenging. Most existing algorithms are often vulnerable to reward hacking, such as quality degradation, over-stylization, or…

Machine Learning · Computer Science 2025-12-25 Haotian Ye, Kaiwen Zheng, Jiashu Xu, Puheng Li, Huayu Chen, Jiaqi Han, Sheng Liu, Qinsheng Zhang, Hanzi Mao, Zekun Hao, Prithvijit Chattopadhyay, Dinghao Yang, Liang Feng, Maosheng Liao, Junjie Bai, Ming-Yu Liu, James Zou, Stefano Ermon

Rethinking Adversarial Attacks in Reinforcement Learning from Policy Distribution Perspective

Deep Reinforcement Learning (DRL) suffers from uncertainties and inaccuracies in the observation signal in real-world applications. Adversarial attack is an effective method for evaluating the robustness of DRL agents. However, existing…

Machine Learning · Computer Science 2025-01-09 Tianyang Duan, Zongyuan Zhang, Zheng Lin, Yue Gao, Ling Xiong, Yong Cui, Hongbin Liang, Xianhao Chen, Heming Cui, Dong Huang

A Deep Reinforcement Learning Approach to Efficient Distributed Optimization

In distributed optimization, the practical problem-solving performance is essentially sensitive to algorithm selection, parameter setting, problem type and data pattern. Thus, it is often laborious to acquire a highly efficient method for a…

Optimization and Control · Mathematics 2024-01-04 Daokuan Zhu, Tianqi Xu, Jie Lu

Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards

We propose a general and model-free approach for Reinforcement Learning (RL) on real robotics with sparse rewards. We build upon the Deep Deterministic Policy Gradient (DDPG) algorithm to use demonstrations. Both demonstrations and actual…

Artificial Intelligence · Computer Science 2018-10-09 Mel Vecerik, Todd Hester, Jonathan Scholz, Fumin Wang, Olivier Pietquin, Bilal Piot, Nicolas Heess, Thomas Rothörl, Thomas Lampe, Martin Riedmiller

Policy Evaluation in Distributional LQR

Distributional reinforcement learning (DRL) enhances the understanding of the effects of the randomness in the environment by letting agents learn the distribution of a random return, rather than its expected value as in standard RL. At the…

Optimization and Control · Mathematics 2023-03-27 Zifan Wang, Yulong Gao, Siyi Wang, Michael M. Zavlanos, Alessandro Abate, Karl H. Johansson

Normality-Guided Distributional Reinforcement Learning for Continuous Control

Learning a predictive model of the mean return, or value function, plays a critical role in many reinforcement learning algorithms. Distributional reinforcement learning (DRL) has been shown to improve performance by modeling the value…

Machine Learning · Computer Science 2025-07-08 Ju-Seung Byun, Andrew Perrault

Breaking the Grid: Distance-Guided Reinforcement Learning in Large Discrete Action Spaces

Reinforcement Learning (RL) is increasingly applied to large-scale decision-making problems like logistics, scheduling, and recommender systems, but existing algorithms struggle with the curse of dimensionality in such large discrete action…

Machine Learning · Computer Science 2026-05-12 Heiko Hoppe, Fabian Akkerman, Wouter van Heeswijk, Maximilian Schiffer

A Risk-Sensitive Approach to Policy Optimization

Standard deep reinforcement learning (DRL) aims to maximize expected reward, considering collected experiences equally in formulating a policy. This differs from human decision-making, where gains and losses are valued differently and…

Machine Learning · Computer Science 2023-11-17 Jared Markowitz, Ryan W. Gardner, Ashley Llorens, Raman Arora, I-Jeng Wang

Policy Evaluation in Distributional LQR (Extended Version)

Distributional reinforcement learning (DRL) enhances the understanding of the effects of the randomness in the environment by letting agents learn the distribution of a random return, rather than its expected value as in standard…

Optimization and Control · Mathematics 2024-03-26 Zifan Wang, Yulong Gao, Siyi Wang, Michael M. Zavlanos, Alessandro Abate, Karl H. Johansson

Robustness and risk management via distributional dynamic programming

In dynamic programming (DP) and reinforcement learning (RL), an agent learns to act optimally in terms of expected long-term return by sequentially interacting with its environment modeled by a Markov decision process (MDP). More generally…

Machine Learning · Computer Science 2022-01-03 Mastane Achab, Gergely Neu

A Differential Perspective on Distributional Reinforcement Learning

To date, distributional reinforcement learning (distributional RL) methods have exclusively focused on the discounted setting, where an agent aims to optimize a discounted sum of rewards over time. In this work, we extend distributional RL…

Machine Learning · Computer Science 2026-01-14 Juan Sebastian Rojas, Chi-Guhn Lee

Deep Reinforcement Learning for Resource Constrained Multiclass Scheduling in Wireless Networks

The problem of resource constrained scheduling in a dynamic and heterogeneous wireless setting is considered here. In our setup, the available limited bandwidth resources are allocated in order to serve randomly arriving service demands,…

Machine Learning · Computer Science 2022-04-01 Apostolos Avranas, Marios Kountouris, Philippe Ciblat

Distributionally Robust Self Paced Curriculum Reinforcement Learning

A central challenge in reinforcement learning is that policies trained in controlled environments often fail under distribution shifts at deployment into real-world environments. Distributionally Robust Reinforcement Learning (DRRL)…

Machine Learning · Computer Science 2026-03-10 Anirudh Satheesh, Keenan Powell, Vaneet Aggarwal

One-Step Distributional Reinforcement Learning

Reinforcement learning (RL) allows an agent interacting sequentially with an environment to maximize its long-term expected return. In the distributional RL (DistrRL) paradigm, the agent goes beyond the limit of the expected value, to…

Machine Learning · Computer Science 2023-05-01 Mastane Achab, Reda Alami, Yasser Abdelaziz Dahou Djilali, Kirill Fedyanin, Eric Moulines

A Rollout-Based Algorithm and Reward Function for Resource Allocation in Business Processes

Resource allocation plays a critical role in minimizing cycle time and improving the efficiency of business processes. Recently, Deep Reinforcement Learning (DRL) has emerged as a powerful technique to optimize resource allocation policies…

Machine Learning · Computer Science 2025-09-03 Jeroen Middelhuis, Zaharah Bukhsh, Ivo Adan, Remco Dijkman

Model Free Deep Deterministic Policy Gradient Controller for Setpoint Tracking of Non-minimum Phase Systems

Deep Reinforcement Learning (DRL) techniques have received significant attention in control and decision-making algorithms. Most applications involve complex decision-making systems, justified by the algorithms' computational power and…

Systems and Control · Electrical Eng. & Systems 2024-02-28 Fatemeh Tavakkoli, Pouria Sarhadi, Benoit Clement, Wasif Naeem

Learning of Long-Horizon Sparse-Reward Robotic Manipulator Tasks with Base Controllers

Deep Reinforcement Learning (DRL) enables robots to perform some intelligent tasks end-to-end. However, there are still many challenges for long-horizon sparse-reward robotic manipulator tasks. On the one hand, a sparse-reward setting…

Robotics · Computer Science 2021-12-07 Guangming Wang, Minjian Xin, Wenhua Wu, Zhe Liu, Hesheng Wang