Related papers: Policy Gradient Bayesian Robust Optimization for I…

200 papers

One of the main challenges in imitation learning is determining what action an agent should take when outside the state distribution of the demonstrations. Inverse reinforcement learning (IRL) can enable generalization to new states by…

Machine Learning · Computer Science 2024-03-04 Daniel S. Brown, Scott Niekum, Marek Petrik

Humans are masters at quickly learning many complex tasks, relying on an approximate understanding of the dynamics of their environments. In much the same way, we would like our learning agents to quickly adapt to new tasks. In this paper,…

Inverse reinforcement learning (IRL) addresses the problem of recovering a task description given a demonstration of the optimal policy used to solve such a task. The optimal policy is usually provided by an expert or teacher, making IRL…

Machine Learning · Computer Science 2012-02-09 Héctor Ratia, Luis Montesano, Ruben Martinez-Cantin

Policy gradient methods, which have been extensively studied in the last decade, offer an effective and efficient framework for reinforcement learning problems. However, their performances can often be unsatisfactory, suffering from…

Machine Learning · Computer Science 2026-01-27 Shihab Ahmed, El Houcine Bergou, Aritra Dutta, Yue Wang

We consider (stochastic) softmax policy gradient (PG) methods for bandits and tabular Markov decision processes (MDPs). While the PG objective is non-concave, recent research has used the objective's smoothness and gradient domination…

Machine Learning · Computer Science 2024-10-01 Michael Lu, Matin Aghaei, Anant Raj, Sharan Vaswani

We propose policy gradient algorithms for solving a risk-sensitive reinforcement learning (RL) problem in on-policy as well as off-policy settings. We consider episodic Markov decision processes, and model the risk using the broad class of…

Machine Learning · Computer Science 2024-06-25 Nithia Vijayan, Prashanth L. A

This paper develops the first policy gradient method with global optimality guarantee and complexity analysis for robust reinforcement learning under model mismatch. Robust reinforcement learning is to learn a policy robust to model…

Machine Learning · Computer Science 2022-05-17 Yue Wang, Shaofeng Zou

Policy Gradient (PG) algorithms are among the best candidates for the much-anticipated applications of reinforcement learning to real-world control tasks, such as robotics. However, the trial-and-error nature of these methods poses safety…

Machine Learning · Computer Science 2022-06-20 Matteo Papini, Matteo Pirotta, Marcello Restelli

Imitation Learning (IL) is an effective learning paradigm exploiting the interactions between agents and environments. It does not require explicit reward signals and instead tries to recover desired policies using expert demonstrations. In…

Machine Learning · Computer Science 2021-12-14 Yang Liu, Yongzhe Chang, Shilei Jiang, Xueqian Wang, Bin Liang, Bo Yuan

Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL) of the reward. Such methods enable agents to learn complex tasks from humans that are…

Machine Learning · Computer Science 2023-12-07 Joe Watson, Sandy H. Huang, Nicolas Heess

This paper considers the problem of learning safe policies in the context of reinforcement learning (RL). In particular, we consider the notion of probabilistic safety. That is, we aim to design policies that maintain the state of the…

Machine Learning · Computer Science 2023-04-20 Weiqin Chen, Dharmashankar Subramanian, Santiago Paternain

Goal-Conditioned Reinforcement Learning (RL) problems often have access to sparse rewards where the agent receives a reward signal only when it has achieved the goal, making policy optimization a difficult problem. Several works augment…

Machine Learning · Computer Science 2023-10-11 Siddhant Agarwal, Ishan Durugkar, Peter Stone, Amy Zhang

Direct policy gradient methods for reinforcement learning are a successful approach for a variety of reasons: they are model free, they directly optimize the performance metric of interest, and they allow for richly parameterized policies.…

Machine Learning · Computer Science 2020-08-14 Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun

In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion, which minimizes the probability of a catastrophic failure. Unfortunately,…

Machine Learning · Computer Science 2021-03-01 Elita A. Lobo, Mohammad Ghavamzadeh, Marek Petrik

Imitation Learning (IL) has proven highly effective for robotic and control tasks where manually designing reward functions or explicit controllers is infeasible. However, standard IL methods implicitly assume that the environment dynamics…

Machine Learning · Computer Science 2025-11-12 Rishabh Agrawal, Yusuf Alvi, Rahul Jain, Ashutosh Nayyar

Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories, rather than explicit reward signals. While PbRL has demonstrated…

Machine Learning · Computer Science 2024-04-18 Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee

Constrained decision-making is essential for designing safe policies in real-world control systems, yet simulated environments often fail to capture real-world adversities. We consider the problem of learning a policy that will maximize the…

Machine Learning · Computer Science 2026-02-10 Sourav Ganguly, Kishan Panaganti, Arnob Ghosh, Adam Wierman

Reinforcement learning (RL) with sparse and deceptive rewards is challenging because non-zero rewards are rarely obtained. Hence, the gradient calculated by the agent can be stochastic and without valid information. Recent studies that…

Machine Learning · Computer Science 2024-02-08 Guojian Wang, Faguo Wu, Xiao Zhang, Jianxiang Liu

The problem of inverse reinforcement learning (IRL) is relevant to a variety of tasks including value alignment and robot learning from demonstration. Despite significant algorithmic contributions in recent years, IRL remains an ill-posed…

Machine Learning · Computer Science 2020-11-18 Sreejith Balakrishnan, Quoc Phong Nguyen, Bryan Kian Hsiang Low, Harold Soh

Sequential Bayesian optimal experimental design (SBOED) for PDE-governed inverse problems is computationally challenging, especially for infinite-dimensional random field parameters. High-fidelity approaches require repeated forward and…

Optimization and Control · Mathematics 2026-01-12 Kaichen Shen, Peng Chen