Related papers: FlowPG: Action-constrained Policy Gradient with No…

Leveraging Constraint Violation Signals For Action-Constrained Reinforcement Learning

In many RL applications, ensuring an agent's actions adhere to constraints is crucial for safety. Most previous methods in Action-Constrained Reinforcement Learning (ACRL) employ a projection layer after the policy network to correct the…

Machine Learning · Computer Science 2025-02-18 Janaka Chathuranga Brahmanage, Jiajing Ling, Akshat Kumar

Escaping from Zero Gradient: Revisiting Action-Constrained Reinforcement Learning via Frank-Wolfe Policy Optimization

Action-constrained reinforcement learning (RL) is a widely-used approach in various real-world applications, such as scheduling in networked systems with resource constraints and control of a robot with kinematic constraints. While the…

Machine Learning · Computer Science 2021-08-03 Jyun-Li Lin, Wei Hung, Shang-Hsuan Yang, Ping-Chun Hsieh, Xi Liu

Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs

Action-constrained reinforcement learning (ACRL) is a generic framework for learning control policies with zero action constraint violation, which is required by various safety-critical and resource-constrained applications. The existing…

Machine Learning · Computer Science 2025-03-18 Wei Hung, Shao-Hua Sun, Ping-Chun Hsieh

Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning

Constrained Reinforcement Learning (CRL) tackles sequential decision-making problems where agents are required to achieve goals by maximizing the expected return while meeting domain-specific constraints, which are often formulated as…

Machine Learning · Computer Science 2024-11-13 Alessandro Montenegro, Marco Mussi, Matteo Papini, Alberto Maria Metelli

Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes

Constrained Reinforcement Learning (CRL) addresses sequential decision-making problems where agents are required to achieve goals by maximizing the expected return while meeting domain-specific constraints. In this setting, policy-based…

Machine Learning · Computer Science 2025-06-09 Alessandro Montenegro, Leonardo Cesani, Marco Mussi, Matteo Papini, Alberto Maria Metelli

Constrained Reinforcement Learning for Predictive Control in Real-Time Stochastic Dynamic Optimal Power Flow

Deep Reinforcement Learning (DRL) has become a popular method for solving control problems in power systems. Conventional DRL encourages the agent to explore various policies encoded in a neural network (NN) with the goal of maximizing the…

Systems and Control · Electrical Eng. & Systems 2024-10-28 Tong Wu, Anna Scaglione, Daniel Arnold

Constrained Reinforcement Learning via Dissipative Saddle Flow Dynamics

In constrained reinforcement learning (C-RL), an agent seeks to learn from the environment a policy that maximizes the expected cumulative reward while satisfying minimum requirements in secondary cumulative reward constraints. Several…

Machine Learning · Computer Science 2022-12-06 Tianqi Zheng, Pengcheng You, Enrique Mallada

Sample Efficient Active Algorithms for Offline Reinforcement Learning

Offline reinforcement learning (RL) enables policy learning from static data but often suffers from poor coverage of the state-action space and distributional shift problems. This problem can be addressed by allowing limited online…

Machine Learning · Computer Science 2026-02-03 Soumyadeep Roy, Shashwat Kushwaha, Ambedkar Dukkipati

Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning

Many problems in Reinforcement Learning (RL) seek an optimal policy with large discrete multidimensional yet unordered action spaces; these include problems in randomized allocation of resources such as placements of multiple security…

Machine Learning · Computer Science 2023-11-28 Changyu Chen, Ramesha Karunasena, Thanh Hong Nguyen, Arunesh Sinha, Pradeep Varakantham

Drag-reduction strategies in wall-bounded turbulent flows using deep reinforcement learning

In this work we compare different drag-reduction strategies that compute their actuation based on the fluctuations at a given wall-normal location in turbulent open channel flow. In order to perform this study, we implement and describe in…

Fluid Dynamics · Physics 2023-09-07 L. Guastoni, J. Rabault, H. Azizpour, R. Vinuesa

Accelerating Deep Reinforcement Learning strategies of Flow Control through a multi-environment approach

Deep Reinforcement Learning (DRL) has recently been proposed as a methodology to discover complex Active Flow Control (AFC) strategies [Rabault, J., Kuchta, M., Jensen, A., Reglade, U., & Cerardi, N. (2019): "Artificial neural networks…

Computational Physics · Physics 2019-10-23 Jean Rabault, Alexander Kuhnle

Natural Policy Gradient and Actor Critic Methods for Constrained Multi-Task Reinforcement Learning

Multi-task reinforcement learning (RL) aims to find a single policy that effectively solves multiple tasks at the same time. This paper presents a constrained formulation for multi-task RL where the goal is to maximize the average…

Optimization and Control · Mathematics 2024-05-07 Sihan Zeng, Thinh T. Doan, Justin Romberg

Flow-Based Policy for Online Reinforcement Learning

We present FlowRL, a novel framework for online reinforcement learning that integrates flow-based policy representation with Wasserstein-2-regularized optimization. We argue that in addition to training signals, enhancing the…

Machine Learning · Computer Science 2025-06-17 Lei Lv, Yunfei Li, Yu Luo, Fuchun Sun, Tao Kong, Jiafeng Xu, Xiao Ma

Deep Reinforcement Learning for Equal Risk Pricing and Hedging under Dynamic Expectile Risk Measures

Recently equal risk pricing, a framework for fair derivative pricing, was extended to consider dynamic risk measures. However, all current implementations either employ a static risk measure that violates time consistency, or are based on…

Pricing of Securities · Quantitative Finance 2021-09-10 Saeed Marzban, Erick Delage, Jonathan Yumeng Li

Improving Stochastic Action-Constrained Reinforcement Learning via Truncated Distributions

In reinforcement learning (RL), it is often advantageous to consider additional constraints on the action space to ensure safety or action relevance. Existing work on such action-constrained RL faces challenges regarding effective policy…

Machine Learning · Computer Science 2025-12-01 Roland Stolz, Michael Eichelbeck, Matthias Althoff

Action Mapping for Reinforcement Learning in Continuous Environments with Constraints

Deep reinforcement learning (DRL) has had success across various domains, but applying it to environments with constraints remains challenging due to poor sample efficiency and slow convergence. Recent literature explored incorporating…

Machine Learning · Computer Science 2024-12-06 Mirco Theile, Lukas Dirnberger, Raphael Trumpp, Marco Caccamo, Alberto L. Sangiovanni-Vincentelli

Breaking the Grid: Distance-Guided Reinforcement Learning in Large Discrete Action Spaces

Reinforcement Learning (RL) is increasingly applied to large-scale decision-making problems like logistics, scheduling, and recommender systems, but existing algorithms struggle with the curse of dimensionality in such large discrete action…

Machine Learning · Computer Science 2026-05-12 Heiko Hoppe, Fabian Akkerman, Wouter van Heeswijk, Maximilian Schiffer

Learning Vehicle Routing Problems using Policy Optimisation

Deep reinforcement learning (DRL) has been used to learn effective heuristics for solving complex combinatorial optimisation problems via policy networks and has demonstrated promising performance. Existing works have focused on solving…

Machine Learning · Computer Science 2020-12-25 Nasrin Sultana, Jeffrey Chan, A. K. Qin, Tabinda Sarwar

ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs

In recent years, Reinforcement Learning (RL) has been applied to real-world problems with increasing success. Such applications often require placing constraints on the agent's behavior. Existing algorithms for constrained RL (CRL) rely on…

Machine Learning · Computer Science 2023-03-07 Ted Moskovitz, Brendan O'Donoghue, Vivek Veeriah, Sebastian Flennerhag, Satinder Singh, Tom Zahavy

Reinforcement Learning via Value Gradient Flow

We study behavior-regularized reinforcement learning (RL), where regularization toward a reference distribution (the dataset in offline RL or the base model in LLM RL finetuning) is essential to prevent value over-optimization caused by…

Machine Learning · Computer Science 2026-04-17 Haoran Xu, Kaiwen Hu, Somayeh Sojoudi, Amy Zhang