Related papers: Generative Adversarial Imitation Learning

Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation

We consider the problem of imitation learning from a finite set of expert trajectories, without access to reinforcement signals. The classical approach of extracting the expert's reward function via inverse reinforcement learning, followed…

Machine Learning · Computer Science 2019-06-10 Ruohan Wang, Carlo Ciliberto, Pierluigi Amadori, Yiannis Demiris

Adversarial Imitation via Variational Inverse Reinforcement Learning

We consider a problem of learning the reward and policy from expert examples under unknown dynamics. Our proposed method builds on the framework of generative adversarial networks and introduces the empowerment-regularized maximum-entropy…

Machine Learning · Computer Science 2019-02-26 Ahmed H. Qureshi, Byron Boots, Michael C. Yip

Event Extraction with Generative Adversarial Imitation Learning

We propose a new method for event extraction (EE) task based on an imitation learning framework, specifically, inverse reinforcement learning (IRL) via generative adversarial network (GAN). The GAN estimates proper rewards according to the…

Computation and Language · Computer Science 2018-04-24 Tongtao Zhang, Heng Ji

Imitation Learning by Reinforcement Learning

Imitation learning algorithms learn a policy from demonstrations of expert behavior. We show that, for deterministic experts, imitation learning can be done by reduction to reinforcement learning with a stationary reward. Our theoretical…

Machine Learning · Statistics 2022-03-16 Kamil Ciosek

OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning

Reinforcement learning has shown promise in learning policies that can solve complex problems. However, manually specifying a good reward function can be difficult, especially for intricate tasks. Inverse reinforcement learning offers a…

Machine Learning · Computer Science 2017-11-28 Peter Henderson, Wei-Di Chang, Pierre-Luc Bacon, David Meger, Joelle Pineau, Doina Precup

Multi-Agent Generative Adversarial Imitation Learning

Imitation learning algorithms can be used to learn a policy from expert demonstrations without access to a reward signal. However, most existing approaches are not applicable in multi-agent settings due to the existence of multiple (Nash)…

Machine Learning · Computer Science 2018-07-27 Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon

Adversarial Imitation Learning via Random Search

Developing agents that can perform challenging complex tasks is the goal of reinforcement learning. The model-free reinforcement learning has been considered as a feasible solution. However, the state of the art research has been to develop…

Machine Learning · Computer Science 2020-08-24 MyungJae Shin, Joongheon Kim

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Adversarial Imitation Learning alternates between learning a discriminator -- which tells apart expert's demonstrations from generated ones -- and a generator's policy to produce trajectories that can fool this discriminator. This…

Machine Learning · Computer Science 2021-04-19 Paul Barde, Julien Roy, Wonseok Jeon, Joelle Pineau, Christopher Pal, Derek Nowrouzezahrai

Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation

Reinforcement learning is well suited for optimizing policies of recommender systems. Current solutions mostly focus on model-free approaches, which require frequent interactions with the real environment, and thus are expensive in model…

Machine Learning · Computer Science 2020-01-22 Xueying Bai, Jian Guan, Hongning Wang

Learning human behaviors from motion capture by adversarial imitation

Rapid progress in deep reinforcement learning has made it increasingly feasible to train controllers for high-dimensional humanoid bodies. However, methods that use pure reinforcement learning with simple reward functions tend to produce…

Robotics · Computer Science 2017-07-11 Josh Merel, Yuval Tassa, Dhruva TB, Sriram Srinivasan, Jay Lemmon, Ziyu Wang, Greg Wayne, Nicolas Heess

Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching

In inverse reinforcement learning (IRL), an agent seeks to replicate expert demonstrations through interactions with the environment. Traditionally, IRL is treated as an adversarial game, where an adversary searches over reward models, and…

Machine Learning · Computer Science 2025-04-23 Arnav Kumar Jain, Harley Wiltzer, Jesse Farebrother, Irina Rish, Glen Berseth, Sanjiban Choudhury

Reward-Conditioned Policies

Reinforcement learning offers the promise of automating the acquisition of complex behavioral skills. However, compared to commonly used and well-understood supervised learning methods, reinforcement learning algorithms can be brittle,…

Machine Learning · Computer Science 2020-01-01 Aviral Kumar, Xue Bin Peng, Sergey Levine

Reinforcement and Imitation Learning via Interactive No-Regret Learning

Recent work has demonstrated that problems -- particularly imitation learning and structured prediction -- where a learner's predictions influence the input-distribution it is tested on can be naturally addressed by an interactive approach…

Machine Learning · Computer Science 2014-06-24 Stephane Ross, J. Andrew Bagnell

Structured Imitation Learning of Interactive Policies through Inverse Games

Generative model-based imitation learning methods have recently achieved strong results in learning high-complexity motor skills from human demonstrations. However, imitation learning of interactive policies that coordinate with humans in…

Robotics · Computer Science 2025-11-18 Max M. Sun, Todd Murphey

Error Bounds of Imitating Policies and Environments

Imitation learning trains a policy by mimicking expert demonstrations. Various imitation methods were proposed and empirically evaluated, meanwhile, their theoretical understanding needs further studies. In this paper, we firstly analyze…

Machine Learning · Computer Science 2020-10-23 Tian Xu, Ziniu Li, Yang Yu

Inverse Reinforcement Learning from a Gradient-based Learner

Inverse Reinforcement Learning addresses the problem of inferring an expert's reward function from demonstrations. However, in many applications, we not only have access to the expert's near-optimal behavior, but we also observe part of her…

Machine Learning · Computer Science 2021-09-03 Giorgia Ramponi, Gianluca Drappo, Marcello Restelli

Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference

Recent advances in reinforcement learning have inspired increasing interest in learning user modeling adaptively through dynamic interactions, e.g., in reinforcement learning based recommender systems. Reward function is crucial for most of…

Machine Learning · Computer Science 2021-05-06 Xiaocong Chen, Lina Yao, Xianzhi Wang, Aixin Sun, Wenjie Zhang, Quan Z. Sheng

Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods

In this paper we propose a novel gradient algorithm to learn a policy from an expert's observed behavior assuming that the expert behaves optimally with respect to some unknown reward function of a Markovian Decision Problem. The…

Machine Learning · Computer Science 2012-06-26 Gergely Neu, Csaba Szepesvari

Model-Free Imitation Learning with Policy Optimization

In imitation learning, an agent learns how to behave in an environment with an unknown cost function by mimicking expert demonstrations. Existing imitation learning algorithms typically involve solving a sequence of planning or…

Machine Learning · Computer Science 2016-06-17 Jonathan Ho, Jayesh K. Gupta, Stefano Ermon

Limitation Learning: Catching Adverse Dialog with GAIL

Imitation learning is a proven method for creating a policy in the absence of rewards, by leveraging expert demonstrations. In this work, we apply imitation learning to conversation. In doing so, we recover a policy capable of talking to a…

Computation and Language · Computer Science 2025-08-19 Noah Kasmanoff, Rahul Zalkikar