Related papers: On the function approximation error for risk-sensi…

Risk-sensitive reinforcement learning using expectiles, shortfall risk and optimized certainty equivalent risk

We propose risk-sensitive reinforcement learning algorithms catering to three families of risk measures, namely expectiles, utility-based shortfall risk and optimized certainty equivalent risk. For each risk measure, in the context of a…

Machine Learning · Computer Science 2026-02-11 Sumedh Gupte, Shrey Rakeshkumar Patel, Soumen Pachal, Prashanth L. A., Sanjay P. Bhat

Randomized algorithms and PAC bounds for inverse reinforcement learning in continuous spaces

This work studies discrete-time discounted Markov decision processes with continuous state and action spaces and addresses the inverse problem of inferring a cost function from observed optimal behavior. We first consider the case in which…

Optimization and Control · Mathematics 2024-05-27 Angeliki Kamoutsi, Peter Schmitt-Förster, Tobias Sutter, Volkan Cevher, John Lygeros

Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models

In this paper we study a model-based approach to calculating approximately optimal policies in Markovian Decision Processes. In particular, we derive novel bounds on the loss of using a policy derived from a factored linear model, a class…

Machine Learning · Statistics 2016-09-22 Bernardo Ávila Pires, Csaba Szepesvári

Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning

In the field of reinforcement learning there has been recent progress towards safety and high-confidence bounds on policy performance. However, to our knowledge, no practical methods exist for determining high-confidence policy performance…

Artificial Intelligence · Computer Science 2018-06-26 Daniel S. Brown, Scott Niekum

Reinforcement Learning with Function Approximation for Non-Markov Processes

We study reinforcement learning methods with linear function approximation under non-Markov state and cost processes. We first consider the policy evaluation method and show that the algorithm converges under suitable ergodicity conditions…

Machine Learning · Computer Science 2026-01-05 Ali Devran Kara

Provably Efficient Reinforcement Learning via Surprise Bound

Value function approximation is important in modern reinforcement learning (RL) problems especially when the state space is (infinitely) large. Despite the importance and wide applicability of value function approximation, its theoretical…

Machine Learning · Computer Science 2023-02-24 Hanlin Zhu, Ruosong Wang, Jason D. Lee

Polynomial-Time Approximability of Constrained Reinforcement Learning

We study the computational complexity of approximating general constrained Markov decision processes. Our primary contribution is the design of a polynomial time $(0,\epsilon)$-additive bicriteria approximation algorithm for finding optimal…

Data Structures and Algorithms · Computer Science 2025-02-12 Jeremy McMahan

An Actor-Critic Algorithm with Function Approximation for Risk Sensitive Cost Markov Decision Processes

In this paper, we consider the risk-sensitive cost criterion with exponentiated costs for Markov decision processes and develop a model-free policy gradient algorithm in this setting. Unlike additive cost criteria such as average or…

Machine Learning · Computer Science 2025-08-05 Soumyajit Guin, Vivek S. Borkar, Shalabh Bhatnagar

Approximating Euclidean by Imprecise Markov Decision Processes

Euclidean Markov decision processes are a powerful tool for modeling control problems under uncertainty over continuous domains. Finite-state imprecise Markov decision processes can be used to approximate the behavior of these infinite…

Artificial Intelligence · Computer Science 2020-06-29 Manfred Jaeger, Giorgio Bacci, Giovanni Bacci, Kim Guldstrand Larsen, Peter Gjøl Jensen

Risk-Sensitive Reinforcement Learning with Exponential Criteria

While reinforcement learning has shown experimental success in a number of applications, it is known to be sensitive to noise and perturbations in the parameters of the system, leading to high variance in the total reward amongst different…

Systems and Control · Electrical Eng. & Systems 2024-12-02 Erfaun Noorani, Christos Mavridis, John Baras

Approximation Capabilities of Feedforward Neural Networks with GELU Activations

We derive an approximation error bound that holds simultaneously for a function and all its derivatives up to any prescribed order. The bounds apply to elementary functions, including multivariate polynomials, the exponential function, and…

Machine Learning · Computer Science 2025-12-29 Konstantin Yakovlev, Nikita Puchkin

Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism

This paper studies the safe reinforcement learning problem formulated as an episodic finite-horizon tabular constrained Markov decision process with an unknown transition kernel and stochastic reward and cost functions. We propose a…

Machine Learning · Computer Science 2024-10-15 Kihyun Yu, Duksang Lee, William Overman, Dabeen Lee

Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation

We provide performance guarantees for a variant of simulation-based policy iteration for controlling Markov decision processes that involves the use of stochastic approximation algorithms along with state-of-the-art techniques that are…

Machine Learning · Computer Science 2022-10-17 Anna Winnicki, R. Srikant

Learning POMDPs with Linear Function Approximation and Finite Memory

We study reinforcement learning with linear function approximation and finite-memory approximations for partially observed Markov decision processes (POMDPs). We first present an algorithm for the value evaluation of finite-memory feedback…

Optimization and Control · Mathematics 2025-05-22 Ali Devran Kara

Reinforcement Learning with Almost Sure Constraints

In this work we address the problem of finding feasible policies for Constrained Markov Decision Processes under probability one constraints. We argue that stationary policies are not sufficient for solving this problem, and that a rich…

Machine Learning · Computer Science 2023-02-14 Agustin Castellano, Hancheng Min, Juan Bazerque, Enrique Mallada

Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension

Value function approximation has demonstrated phenomenal empirical success in reinforcement learning (RL). Nevertheless, despite some recent progress on developing theory for RL with linear function approximation, the understanding…

Machine Learning · Computer Science 2020-06-22 Ruosong Wang, Ruslan Salakhutdinov, Lin F. Yang

Robust Bayesian reinforcement learning through tight lower bounds

In the Bayesian approach to sequential decision making, exact calculation of the (subjective) utility is intractable. This extends to most special cases of interest, such as reinforcement learning problems. While utility bounds are known to…

Machine Learning · Computer Science 2011-11-14 Christos Dimitrakakis

High-confidence error estimates for learned value functions

Estimating the value function for a fixed policy is a fundamental problem in reinforcement learning. Policy evaluation algorithms (to estimate value functions) continue to be developed, to improve convergence rates, improve stability and…

Machine Learning · Statistics 2018-08-29 Touqir Sajed, Wesley Chung, Martha White

On the Sample Complexity of Discounted Reinforcement Learning with Optimized Certainty Equivalents

We study risk-sensitive reinforcement learning in finite discounted MDPs, where a generative model of the MDP is assumed to be available. We consider a family of risk measures called the optimized certainty equivalent (OCE), which includes…

Machine Learning · Computer Science 2026-05-22 Oliver Mortensen, Mohammad Sadegh Talebi

Provably Safe Reinforcement Learning for Stochastic Reach-Avoid Problems with Entropy Regularization

We consider the problem of learning the optimal policy for Markov decision processes with safety constraints. We formulate the problem in a reach-avoid setup. Our goal is to design online reinforcement learning algorithms that ensure safety…

Machine Learning · Computer Science 2026-01-21 Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu