Related papers: Sample-based Distributional Policy Gradient

200 papers

The theory of continuous-time reinforcement learning (RL) has progressed rapidly in recent years. While the ultimate objective of RL is typically to learn deterministic control policies, most existing continuous-time RL methods rely on…

Machine Learning · Computer Science 2026-03-17 Ziheng Cheng, Xin Guo, Yufei Zhang

Risk-sensitive reinforcement learning (RL) is crucial for maintaining reliable performance in high-stakes applications. While traditional RL methods aim to learn a point estimate of the random cumulative cost, distributional RL (DRL) seeks…

Machine Learning · Computer Science 2025-02-03 Minheng Xiao, Xian Yu, Lei Ying

This work adopts the very successful distributional perspective on reinforcement learning and adapts it to the continuous control setting. We combine this within a distributed framework for off-policy learning in order to develop what we…

Aligning generative diffusion models with human preferences via reinforcement learning (RL) is critical yet challenging. Most existing algorithms are often vulnerable to reward hacking, such as quality degradation, over-stylization, or…

Deep Reinforcement Learning (DRL) suffers from uncertainties and inaccuracies in the observation signal in real-world applications. Adversarial attack is an effective method for evaluating the robustness of DRL agents. However, existing…

Machine Learning · Computer Science 2025-01-09 Tianyang Duan, Zongyuan Zhang, Zheng Lin, Yue Gao, Ling Xiong, Yong Cui, Hongbin Liang, Xianhao Chen, Heming Cui, Dong Huang

In distributed optimization, the practical problem-solving performance is essentially sensitive to algorithm selection, parameter setting, problem type and data pattern. Thus, it is often laborious to acquire a highly efficient method for a…

Optimization and Control · Mathematics 2024-01-04 Daokuan Zhu, Tianqi Xu, Jie Lu

We propose a general and model-free approach for Reinforcement Learning (RL) on real robotics with sparse rewards. We build upon the Deep Deterministic Policy Gradient (DDPG) algorithm to use demonstrations. Both demonstrations and actual…

Distributional reinforcement learning (DRL) enhances the understanding of the effects of the randomness in the environment by letting agents learn the distribution of a random return, rather than its expected value as in standard RL. At the…

Optimization and Control · Mathematics 2023-03-27 Zifan Wang, Yulong Gao, Siyi Wang, Michael M. Zavlanos, Alessandro Abate, Karl H. Johansson

Learning a predictive model of the mean return, or value function, plays a critical role in many reinforcement learning algorithms. Distributional reinforcement learning (DRL) has been shown to improve performance by modeling the value…

Machine Learning · Computer Science 2025-07-08 Ju-Seung Byun, Andrew Perrault

Reinforcement Learning (RL) is increasingly applied to large-scale decision-making problems like logistics, scheduling, and recommender systems, but existing algorithms struggle with the curse of dimensionality in such large discrete action…

Machine Learning · Computer Science 2026-05-12 Heiko Hoppe, Fabian Akkerman, Wouter van Heeswijk, Maximilian Schiffer

Standard deep reinforcement learning (DRL) aims to maximize expected reward, considering collected experiences equally in formulating a policy. This differs from human decision-making, where gains and losses are valued differently and…

Machine Learning · Computer Science 2023-11-17 Jared Markowitz, Ryan W. Gardner, Ashley Llorens, Raman Arora, I-Jeng Wang

Distributional reinforcement learning (DRL) enhances the understanding of the effects of the randomness in the environment by letting agents learn the distribution of a random return, rather than its expected value as in standard…

Optimization and Control · Mathematics 2024-03-26 Zifan Wang, Yulong Gao, Siyi Wang, Michael M. Zavlanos, Alessandro Abate, Karl H. Johansson

In dynamic programming (DP) and reinforcement learning (RL), an agent learns to act optimally in terms of expected long-term return by sequentially interacting with its environment modeled by a Markov decision process (MDP). More generally…

Machine Learning · Computer Science 2022-01-03 Mastane Achab, Gergely Neu

To date, distributional reinforcement learning (distributional RL) methods have exclusively focused on the discounted setting, where an agent aims to optimize a discounted sum of rewards over time. In this work, we extend distributional RL…

Machine Learning · Computer Science 2026-01-14 Juan Sebastian Rojas, Chi-Guhn Lee

The problem of resource constrained scheduling in a dynamic and heterogeneous wireless setting is considered here. In our setup, the available limited bandwidth resources are allocated in order to serve randomly arriving service demands,…

Machine Learning · Computer Science 2022-04-01 Apostolos Avranas, Marios Kountouris, Philippe Ciblat

A central challenge in reinforcement learning is that policies trained in controlled environments often fail under distribution shifts at deployment into real-world environments. Distributionally Robust Reinforcement Learning (DRRL)…

Machine Learning · Computer Science 2026-03-10 Anirudh Satheesh, Keenan Powell, Vaneet Aggarwal

Reinforcement learning (RL) allows an agent interacting sequentially with an environment to maximize its long-term expected return. In the distributional RL (DistrRL) paradigm, the agent goes beyond the limit of the expected value, to…

Machine Learning · Computer Science 2023-05-01 Mastane Achab, Reda Alami, Yasser Abdelaziz Dahou Djilali, Kirill Fedyanin, Eric Moulines

Resource allocation plays a critical role in minimizing cycle time and improving the efficiency of business processes. Recently, Deep Reinforcement Learning (DRL) has emerged as a powerful technique to optimize resource allocation policies…

Machine Learning · Computer Science 2025-09-03 Jeroen Middelhuis, Zaharah Bukhsh, Ivo Adan, Remco Dijkman

Deep Reinforcement Learning (DRL) techniques have received significant attention in control and decision-making algorithms. Most applications involve complex decision-making systems, justified by the algorithms' computational power and…

Systems and Control · Electrical Eng. & Systems 2024-02-28 Fatemeh Tavakkoli, Pouria Sarhadi, Benoit Clement, Wasif Naeem

Deep Reinforcement Learning (DRL) enables robots to perform some intelligent tasks end-to-end. However, there are still many challenges for long-horizon sparse-reward robotic manipulator tasks. On the one hand, a sparse-reward setting…

Robotics · Computer Science 2021-12-07 Guangming Wang, Minjian Xin, Wenhua Wu, Zhe Liu, Hesheng Wang