Related papers: Safe Reinforcement Learning with Linear Function A…

Provably Efficient Model-Free Constrained RL with Linear Function Approximation

We study the constrained reinforcement learning problem, in which an agent aims to maximize the expected cumulative reward subject to a constraint on the expected total value of a utility function. In contrast to existing model-based…

Machine Learning · Computer Science 2023-01-10 Arnob Ghosh, Xingyu Zhou, Ness Shroff

Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation

We study the reinforcement learning (RL) problem in a constrained Markov decision process (CMDP), where an agent explores the environment to maximize the expected cumulative reward while satisfying a single constraint on the expected total…

Machine Learning · Computer Science 2026-01-29 Toshinori Kitamura, Arnob Ghosh, Tadashi Kozuno, Wataru Kumagai, Kazumi Kasaura, Kenta Hoshino, Yohei Hosoe, Yutaka Matsuo

Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation

We study reinforcement learning with linear function approximation where the transition probability and reward functions are linear with respect to a feature mapping $\boldsymbol{\phi}(s,a)$. Specifically, we consider the episodic…

Machine Learning · Computer Science 2023-01-31 Pihe Hu, Yu Chen, Longbo Huang

A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints

In many applications of Reinforcement Learning (RL), it is critically important that the algorithm performs safely, such that instantaneous hard constraints are satisfied at each step, and unsafe states and actions are avoided. However,…

Machine Learning · Computer Science 2023-02-10 Ming Shi, Yingbin Liang, Ness Shroff

Efficient, Low-Regret, Online Reinforcement Learning for Linear MDPs

Reinforcement learning algorithms are usually stated without theoretical guarantees regarding their performance. Recently, Jin, Yang, Wang, and Jordan (COLT 2020) showed a polynomial-time reinforcement learning algorithm (namely, LSVI-UCB)…

Machine Learning · Computer Science 2024-11-19 Philips George John, Arnab Bhattacharyya, Silviu Maniu, Dimitrios Myrisiotis, Zhenan Wu

Nonstationary Reinforcement Learning with Linear Function Approximation

We consider reinforcement learning (RL) in episodic Markov decision processes (MDPs) with linear function approximation under a drifting environment. Specifically, both the reward and state transition functions can evolve over time but their…

Machine Learning · Computer Science 2024-04-16 Huozhi Zhou, Jinglin Chen, Lav R. Varshney, Ashish Jagmohan

Safe Reinforcement Learning with Instantaneous Constraints: The Role of Aggressive Exploration

This paper studies safe Reinforcement Learning (safe RL) with linear function approximation and under hard instantaneous constraints where unsafe actions must be avoided at each step. Existing studies have considered safe RL with hard…

Machine Learning · Computer Science 2023-12-25 Honghao Wei, Xin Liu, Lei Ying

Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints

We study reinforcement learning (RL) with linear function approximation under the adaptivity constraint. We consider two popular limited adaptivity models: the batch learning model and the rare policy switch model, and propose two efficient…

Machine Learning · Computer Science 2022-01-04 Tianhao Wang, Dongruo Zhou, Quanquan Gu

Provably Efficient Lifelong Reinforcement Learning with Linear Function Approximation

We study lifelong reinforcement learning (RL) in a regret minimization setting of linear contextual Markov decision process (MDP), where the agent needs to learn a multi-task policy while solving a streaming sequence of tasks. We propose an…

Machine Learning · Computer Science 2022-06-02 Sanae Amani, Lin F. Yang, Ching-An Cheng

Online Reinforcement Learning in Markov Decision Process Using Linear Programming

We consider online reinforcement learning in an episodic Markov decision process (MDP) with an unknown transition function and stochastic rewards drawn from some fixed but unknown distribution. The learner aims to learn the optimal policy and…

Machine Learning · Computer Science 2024-03-12 Vincent Leon, S. Rasoul Etesami

Linear Stochastic Bandits Under Safety Constraints

Bandit algorithms have various applications in safety-critical systems, where it is important to respect the system constraints that rely on the bandit's unknown parameters at every round. In this paper, we formulate a linear stochastic…

Machine Learning · Computer Science 2019-08-19 Sanae Amani, Mahnoosh Alizadeh, Christos Thrampoulidis

Provably Efficient Reinforcement Learning with Linear Function Approximation

Modern Reinforcement Learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy. The introduction of…

Machine Learning · Computer Science 2019-08-09 Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan

Provably Safe Reinforcement Learning for Stochastic Reach-Avoid Problems with Entropy Regularization

We consider the problem of learning the optimal policy for Markov decision processes with safety constraints. We formulate the problem in a reach-avoid setup. Our goal is to design online reinforcement learning algorithms that ensure safety…

Machine Learning · Computer Science 2026-01-21 Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu

A Lyapunov-based Approach to Safe Reinforcement Learning

In many real-world reinforcement learning (RL) problems, besides optimizing the main objective function, an agent must concurrently avoid violating a number of constraints. In particular, besides optimizing performance it is crucial to…

Machine Learning · Computer Science 2018-05-22 Yinlam Chow, Ofir Nachum, Edgar Duenez-Guzman, Mohammad Ghavamzadeh

Safe Reinforcement Learning in Constrained Markov Decision Processes

Safe reinforcement learning has been a promising approach for optimizing the policy of an agent that operates in safety-critical applications. In this paper, we propose an algorithm, SNO-MDP, that explores and optimizes Markov decision…

Machine Learning · Computer Science 2020-08-18 Akifumi Wachi, Yanan Sui

Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

We study reinforcement learning (RL) with linear function approximation. For episodic time-inhomogeneous linear Markov decision processes (linear MDPs) whose transition probability can be parameterized as a linear function of a given…

Machine Learning · Computer Science 2023-11-07 Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu

Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints

In safe Reinforcement Learning (RL), safety cost is typically defined as a function dependent on the immediate state and actions. In practice, safety constraints can often be non-Markovian due to the insufficient fidelity of state…

Machine Learning · Computer Science 2024-05-07 Siow Meng Low, Akshat Kumar

Online RL in Linearly $q^\pi$-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore

We consider online reinforcement learning (RL) in episodic Markov decision processes (MDPs) under the linear $q^\pi$-realizability assumption, where it is assumed that the action-values of all policies can be expressed as linear functions…

Machine Learning · Computer Science 2023-12-21 Gellért Weisz, András György, Csaba Szepesvári

Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time

In this paper, we present an online reinforcement learning algorithm for constrained Markov decision processes with a safety constraint. Despite the necessary attention of the scientific community, considering stochastic stopping time, the…

Machine Learning · Computer Science 2024-03-26 Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu

Provably Optimal Reinforcement Learning under Safety Filtering

Recent advances in reinforcement learning (RL) enable its use on increasingly complex tasks, but the lack of formal safety guarantees still limits its application in safety-critical settings. A common practical approach is to augment the RL…

Machine Learning · Computer Science 2026-02-12 Donggeon David Oh, Duy P. Nguyen, Haimin Hu, Jaime F. Fisac