Related papers: Optimizing Sensor Redundancy in Sequential Decisio…

Robust Reinforcement Learning Objectives for Sequential Recommender Systems

Attention-based sequential recommendation methods have shown promise in accurately capturing users' evolving interests from their past interactions. Recent research has also explored the integration of reinforcement learning (RL) into these…

Machine Learning · Computer Science 2024-04-19 Melissa Mozifian, Tristan Sylvain, Dave Evans, Lili Meng

Optimization Algorithm for Feedback and Feedforward Policies towards Robot Control Robust to Sensing Failures

Model-free or learning-based control, in particular reinforcement learning (RL), is expected to be applied to complex robotic tasks. Traditional RL requires that the policy being optimized be state-dependent; that is, the policy is a kind of…

Machine Learning · Computer Science 2022-08-09 Taisuke Kobayashi, Kenta Yoshizawa

Safe reinforcement learning control for continuous-time nonlinear systems without a backup controller

This paper proposes an on-policy reinforcement learning (RL) control algorithm that solves the optimal regulation problem for a class of uncertain continuous-time nonlinear systems under user-defined state constraints. We formulate the safe…

Systems and Control · Electrical Eng. & Systems 2022-09-20 Soutrik Bandyopadhyay, Shubhendu Bhasin

R2L: Reliable Reinforcement Learning: Guaranteed Return & Reliable Policies in Reinforcement Learning

In this work, we address the problem of determining reliable policies in reinforcement learning (RL), with a focus on optimization under uncertainty and the need for performance guarantees. While classical RL algorithms aim at maximizing…

Machine Learning · Computer Science 2025-10-22 Nadir Farhi

Observation Space Matters: Benchmark and Optimization Algorithm

Recent advances in deep reinforcement learning (deep RL) enable researchers to solve challenging control problems, from simulated environments to real-world robotic tasks. However, deep RL algorithms are known to be sensitive to the problem…

Robotics · Computer Science 2023-02-01 Joanne Taery Kim, Sehoon Ha

Constrained Variational Policy Optimization for Safe Reinforcement Learning

Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying them to safety-critical applications. Previous primal-dual style approaches suffer from instability issues and lack optimality…

Machine Learning · Computer Science 2022-06-20 Zuxin Liu, Zhepeng Cen, Vladislav Isenbaev, Wei Liu, Zhiwei Steven Wu, Bo Li, Ding Zhao

Off-Policy Reinforcement Learning with Loss Function Weighted by Temporal Difference Error

Training agents via off-policy deep reinforcement learning (RL) requires a large memory, named replay memory, that stores past experiences used for learning. These experiences are sampled, uniformly or non-uniformly, to create the batches…

Machine Learning · Computer Science 2022-12-27 Bumgeun Park, Taeyoung Kim, Woohyeon Moon, Luiz Felipe Vecchietti, Dongsoo Har

Reinforcement Learning Provides a Flexible Approach for Realistic Supply Chain Safety Stock Optimisation

Although safety stock optimisation has been studied for more than 60 years, most companies still use simplistic means to calculate necessary safety stock levels, partly due to the mismatch between existing analytical methods' emphases on…

Multiagent Systems · Computer Science 2021-07-05 Edward Elson Kosasih, Alexandra Brintrup

Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones

Safety remains a central obstacle preventing widespread use of RL in the real world: learning new tasks in uncertain environments requires extensive exploration, but safety requires limiting exploration. We propose Recovery RL, an algorithm…

Machine Learning · Computer Science 2021-05-19 Brijen Thananjeyan, Ashwin Balakrishna, Suraj Nair, Michael Luo, Krishnan Srinivasan, Minho Hwang, Joseph E. Gonzalez, Julian Ibarz, Chelsea Finn, Ken Goldberg

Cautious Reinforcement Learning with Logical Constraints

This paper presents the concept of an adaptive safe padding that forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process. Policies are synthesised to satisfy a goal,…

Machine Learning · Computer Science 2020-03-24 Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening

Dynamic Optimization of Storage Systems Using Reinforcement Learning Techniques

The exponential growth of data-intensive applications has placed unprecedented demands on modern storage systems, necessitating dynamic and efficient optimization strategies. Traditional heuristics employed for storage performance…

Operating Systems · Computer Science 2025-08-25 Chiyu Cheng, Chang Zhou, Yang Zhao

Residual Off-Policy RL for Finetuning Behavior Cloning Policies

Recent advances in behavior cloning (BC) have enabled impressive visuomotor control policies. However, these approaches are limited by the quality of human demonstrations, the manual effort required for data collection, and the diminishing…

Robotics · Computer Science 2025-09-29 Lars Ankile, Zhenyu Jiang, Rocky Duan, Guanya Shi, Pieter Abbeel, Anusha Nagabandi

On Reward-Balancing Methods for Reinforcement Learning

This paper investigates the so-called reward-balancing methods, a novel class of algorithms for solving discounted-return reinforcement learning (RL) problems. These methods consist of iteratively adjusting the reward function to transform…

Optimization and Control · Mathematics 2026-04-23 Simone Baroncini, Bahman Gharesifard, Giuseppe Notarstefano

Reinforcement Learning Based Robust Policy Design for Relay and Power Optimization in DF Relaying Networks

In this paper, we study the outage minimization problem in a decode-and-forward cooperative network with relay uncertainty. To reduce the outage probability and improve the quality of service, existing research usually relies on the…

Information Theory · Computer Science 2022-05-19 Yuanzhe Geng, Erwu Liu, Rui Wang, Pengcheng Sun, Binyu Lu

Learning to reset in target search problems

Target search problems are central to a wide range of fields, from biological foraging to optimization algorithms. Recently, the ability to reset the search has been shown to significantly improve the searcher's efficiency. However, the…

Statistical Mechanics · Physics 2025-03-17 Gorka Muñoz-Gil, Hans J. Briegel, Michele Caraglio

Iterative Reachability Estimation for Safe Reinforcement Learning

Ensuring safety is important for the practical deployment of reinforcement learning (RL). Various challenges must be addressed, such as handling stochasticity in the environments, providing rigorous guarantees of persistent state-wise…

Machine Learning · Computer Science 2023-09-26 Milan Ganai, Zheng Gong, Chenning Yu, Sylvia Herbert, Sicun Gao

RL-Selector: Reinforcement Learning-Guided Data Selection via Redundancy Assessment

Modern deep architectures often rely on large-scale datasets, but training on these datasets incurs high computational and storage overhead. Real-world datasets often contain substantial redundancies, prompting the need for more…

Machine Learning · Computer Science 2025-06-27 Suorong Yang, Peijia Li, Furao Shen, Jian Zhao

Robust Reinforcement Learning for Risk-Sensitive Linear Quadratic Gaussian Control

This paper proposes a novel robust reinforcement learning framework for discrete-time linear systems with model mismatch that may arise from the sim-to-real gap. A key strategy is to invoke advanced techniques from control theory. Using the…

Systems and Control · Electrical Eng. & Systems 2023-12-07 Leilei Cui, Tamer Başar, Zhong-Ping Jiang

Offline Inverse Reinforcement Learning

The objective of offline RL is to learn optimal policies when a fixed dataset of exploratory demonstrations is available and sampling additional observations is impossible (typically if this operation is either costly or raises ethical…

Machine Learning · Computer Science 2021-06-10 Firas Jarboui, Vianney Perchet

The Fallacy of Minimizing Cumulative Regret in the Sequential Task Setting

Online Reinforcement Learning (RL) is typically framed as the process of minimizing cumulative regret (CR) through interactions with an unknown environment. However, real-world RL applications usually involve a sequence of tasks, and the…

Machine Learning · Statistics 2024-10-28 Ziping Xu, Kelly W. Zhang, Susan A. Murphy