Related papers: An Online Prediction Algorithm for Reinforcement L…

200 papers

In this paper, we provide a new algorithm for the problem of prediction in Reinforcement Learning, i.e., estimating the Value Function of a Markov Reward Process (MRP) using the linear function approximation architecture, with memory…

Systems and Control · Computer Science 2016-09-30 Ajin George Joseph, Shalabh Bhatnagar

Reinforcement learning (RL) problems are fundamental in online decision-making and have been instrumental in finding an optimal policy for Markov decision processes (MDPs). Function approximations are usually deployed to handle large or…

Machine Learning · Computer Science 2025-05-20 Jiashuo Jiang, Yiming Zong, Yinyu Ye

We present an iterative inverse reinforcement learning algorithm to infer optimal cost functions in continuous spaces. Based on a popular maximum entropy criterion, our approach iteratively finds a weight improvement step and proposes a…

Machine Learning · Computer Science 2025-05-14 Sarmad Mehrdad, Avadesh Meduri, Ludovic Righetti

To overcome the curses of dimensionality and modeling of Dynamic Programming (DP) methods to solve Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice. Contrary to traditional RL algorithms…

Machine Learning · Computer Science 2021-08-24 Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

In deep Reinforcement Learning (RL), value functions are typically approximated using deep neural networks and trained via mean squared error regression objectives to fit the true value functions. Recent research has proposed an alternative…

Machine Learning · Computer Science 2024-11-19 Denis Tarasov, Kirill Brilliantov, Dmitrii Kharlapenko

In this paper, we present an online reinforcement learning algorithm for constrained Markov decision processes with a safety constraint. Despite the necessary attention of the scientific community, considering stochastic stopping time, the…

Machine Learning · Computer Science 2024-03-26 Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu

In this paper, we present an online reinforcement learning algorithm, called Renewal Monte Carlo (RMC), for infinite horizon Markov decision processes with a designated start state. RMC is a Monte Carlo algorithm and retains the advantages…

Machine Learning · Computer Science 2018-04-05 Jayakumar Subramanian, Aditya Mahajan

Reinforcement Learning aims at identifying and evaluating efficient control policies from data. In many real-world applications, the learner is not allowed to experiment and cannot gather data in an online manner (this is the case when…

Machine Learning · Computer Science 2024-07-02 Daniele Foffano, Alessio Russo, Alexandre Proutiere

We consider the problem of learning the optimal policy for Markov decision processes with safety constraints. We formulate the problem in a reach-avoid setup. Our goal is to design online reinforcement learning algorithms that ensure safety…

Machine Learning · Computer Science 2026-01-21 Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu

Reinforcement learning with outcome-based feedback faces a fundamental challenge: when rewards are only observed at trajectory endpoints, how do we assign credit to the right actions? This paper provides the first comprehensive analysis of…

Machine Learning · Computer Science 2025-07-25 Fan Chen, Zeyu Jia, Alexander Rakhlin, Tengyang Xie

The performance of a reinforcement learning (RL) system depends on the computational architecture used to approximate a value function. Deep learning methods provide both optimization techniques and architectures for approximating nonlinear…

Machine Learning · Computer Science 2021-06-21 John D. Martin, Joseph Modayil

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency…

The cross entropy (CE) method is a model based search method to solve optimization problems where the objective function has minimal structure. The Monte-Carlo version of the CE method employs the naive sample averaging technique which is…

Artificial Intelligence · Computer Science 2018-02-01 Ajin George Joseph, Shalabh Bhatnagar

To overcome the curse of dimensionality and curse of modeling in Dynamic Programming (DP) methods for solving classical Markov Decision Process (MDP) problems, Reinforcement Learning (RL) algorithms are popular. In this paper, we consider…

Machine Learning · Computer Science 2018-11-29 Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

We consider online reinforcement learning in episodic Markov decision process (MDP) with unknown transition function and stochastic rewards drawn from some fixed but unknown distribution. The learner aims to learn the optimal policy and…

Machine Learning · Computer Science 2024-03-12 Vincent Leon, S. Rasoul Etesami

A recently popular approach to solving reinforcement learning is with data from human preferences. In fact, human preference data are now used with classic reinforcement learning algorithms such as actor-critic methods, which involve…

Machine Learning · Computer Science 2024-02-28 Zihao Li, Xiang Ji, Minshuo Chen, Mengdi Wang

We study inverse reinforcement learning (IRL) and imitation learning (IM), the problems of recovering a reward or policy function from expert's demonstrated trajectories. We propose a new way to improve the learning process by adding a…

Machine Learning · Computer Science 2022-08-23 The Viet Bui, Tien Mai, Patrick Jaillet

Reinforcement learning has traditionally been studied with exponential discounting or the average reward setup, mainly due to their mathematical tractability. However, such frameworks fall short of accurately capturing human behavior, which…

Machine Learning · Computer Science 2024-09-18 S. R. Eshwar, Mayank Motwani, Nibedita Roy, Gugan Thoppe

We study reinforcement learning from human feedback in general Markov decision processes, where agents learn from trajectory-level preference comparisons. A central challenge in this setting is to design algorithms that select informative…

Machine Learning · Computer Science 2025-12-05 Andreas Schlaginhaufen, Reda Ouhamma, Maryam Kamgarpour

We study risk-sensitive reinforcement learning (RL), a crucial field due to its ability to enhance decision-making in scenarios where it is essential to manage uncertainty and minimize potential adverse outcomes. Particularly, our work…

Machine Learning · Computer Science 2024-07-11 Dake Zhang, Boxiang Lyu, Shuang Qiu, Mladen Kolar, Tong Zhang