Related papers: Demonstration-Regularized RL

Efficient Deep Reinforcement Learning with Imitative Expert Priors for Autonomous Driving

Deep reinforcement learning (DRL) is a promising way to achieve human-like autonomous driving. However, the low sample efficiency and difficulty of designing reward functions for DRL would hinder its applications in practice. In light of…

Robotics · Computer Science 2021-10-29 Zhiyu Huang, Jingda Wu, Chen Lv

Efficiently Training On-Policy Actor-Critic Networks in Robotic Deep Reinforcement Learning with Demonstration-like Sampled Exploration

In complex environments with high dimension, training a reinforcement learning (RL) model from scratch often suffers from lengthy and tedious collection of agent-environment interactions. Instead, leveraging expert demonstration to guide RL…

Machine Learning · Computer Science 2021-09-28 Zhaorun Chen, Binhao Chen, Shenghan Xie, Liang Gong, Chengliang Liu, Zhengfeng Zhang, Junping Zhang

Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble

Deep reinforcement learning (DRL) provides a new way to generate robot control policy. However, the process of training control policy requires lengthy exploration, resulting in a low sample efficiency of reinforcement learning (RL) in…

Machine Learning · Computer Science 2022-12-08 Chao Li

On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks.…

Machine Learning · Computer Science 2022-12-29 Tim G. J. Rudner, Cong Lu, Michael A. Osborne, Yarin Gal, Yee Whye Teh

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

Reinforcement learning (RL) methods have been shown to be capable of learning intelligent behavior in rich domains. However, this has largely been done in simulated domains without adequate focus on the process of building the simulator. In…

Machine Learning · Computer Science 2019-10-24 Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation

The current paper studies sample-efficient Reinforcement Learning (RL) in settings where only the optimal value function is assumed to be linearly-realizable. It has recently been understood that, even under this seemingly strong assumption…

Machine Learning · Computer Science 2022-07-19 Philip Amortila, Nan Jiang, Dhruv Madeka, Dean P. Foster

Guided Meta-Policy Search

Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples since they learn from scratch. Meta-RL aims to address this challenge by leveraging experience…

Machine Learning · Computer Science 2020-10-28 Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn

Sample-Efficient Reinforcement Learning of Koopman eNMPC

Reinforcement learning (RL) can be used to tune data-driven (economic) nonlinear model predictive controllers ((e)NMPCs) for optimal performance in a specific control task by optimizing the dynamic model or parameters in the policy's…

Machine Learning · Computer Science 2025-05-14 Daniel Mayfrank, Mehmet Velioglu, Alexander Mitsos, Manuel Dahmen

Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents

We investigate the task of learning to follow natural language instructions by jointly reasoning with visual observations and language inputs. In contrast to existing methods which start with learning from demonstrations (LfD) and then use…

Computation and Language · Computer Science 2018-07-10 Wenhan Xiong, Xiaoxiao Guo, Mo Yu, Shiyu Chang, Bowen Zhou, William Yang Wang

Hybrid Inverse Reinforcement Learning

The inverse reinforcement learning approach to imitation learning is a double-edged sword. On the one hand, it can enable learning from a smaller number of expert demonstrations with more robustness to error compounding than behavioral…

Machine Learning · Computer Science 2024-06-06 Juntao Ren, Gokul Swamy, Zhiwei Steven Wu, J. Andrew Bagnell, Sanjiban Choudhury

Robust Maximum Entropy Behavior Cloning

Imitation learning (IL) algorithms use expert demonstrations to learn a specific task. Most of the existing approaches assume that all expert demonstrations are reliable and trustworthy, but what if there exist some adversarial…

Machine Learning · Computer Science 2021-01-06 Mostafa Hussein, Brendan Crowe, Marek Petrik, Momotaz Begum

A Survey of Demonstration Learning

With the fast improvement of machine learning, reinforcement learning (RL) has been used to automate human tasks in different areas. However, training such agents is difficult and restricted to expert users. Moreover, it is mostly limited…

Machine Learning · Computer Science 2023-03-21 André Correia, Luís A. Alexandre

Improved Deep Reinforcement Learning with Expert Demonstrations for Urban Autonomous Driving

Learning-based approaches, such as reinforcement learning (RL) and imitation learning (IL), have indicated superiority over rule-based approaches in complex urban autonomous driving environments, showing great potential to make intelligent…

Robotics · Computer Science 2022-05-31 Haochen Liu, Zhiyu Huang, Jingda Wu, Chen Lv

Model-Ensemble Trust-Region Policy Optimization

Model-free reinforcement learning (RL) methods are succeeding in a growing number of tasks, aided by recent advances in deep learning. However, they tend to suffer from high sample complexity, which hinders their use in real-world domains.…

Machine Learning · Computer Science 2018-10-08 Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel

Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance

In this paper, we study Reinforcement Learning from Demonstrations (RLfD) that improves the exploration efficiency of Reinforcement Learning (RL) by providing expert demonstrations. Most of existing RLfD methods require demonstrations to be…

Machine Learning · Computer Science 2019-11-26 Mingxuan Jing, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Chao Yang, Bin Fang, Huaping Liu

Constrained Meta Reinforcement Learning with Provable Test-Time Safety

Meta reinforcement learning (RL) allows agents to leverage experience across a distribution of tasks on which the agent can train at will, enabling faster learning of optimal policies on new test tasks. Despite its success in improving…

Machine Learning · Computer Science 2026-05-27 Tingting Ni, Maryam Kamgarpour

Teaching Large Language Models to Reason with Reinforcement Learning

Reinforcement Learning from Human Feedback (RLHF) has emerged as a dominant approach for aligning LLM outputs with human preferences. Inspired by the success of RLHF, we study the performance of multiple algorithms that learn from…

Machine Learning · Computer Science 2024-03-08 Alex Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF

Reverse-Kullback-Leibler (KL) regularization has emerged to be a predominant technique used to enhance policy optimization in reinforcement learning (RL) and reinforcement learning from human feedback (RLHF), which forces the learned policy…

Machine Learning · Computer Science 2025-02-12 Heyang Zhao, Chenlu Ye, Quanquan Gu, Tong Zhang

Demonstration-Guided Reinforcement Learning with Learned Skills

Demonstration-guided reinforcement learning (RL) is a promising approach for learning complex behaviors by leveraging both reward feedback and a set of target task demonstrations. Prior approaches for demonstration-guided RL treat every new…

Machine Learning · Computer Science 2021-07-22 Karl Pertsch, Youngwoon Lee, Yue Wu, Joseph J. Lim

Reinforcement Learning with Supervision from Noisy Demonstrations

Reinforcement learning has achieved great success in various applications. To learn an effective policy for the agent, it usually requires a huge amount of data by interacting with the environment, which could be computational costly and…

Machine Learning · Computer Science 2020-06-16 Kun-Peng Ning, Sheng-Jun Huang