Related papers: Parameter Critic: a Model Free Variance Reduction …

Model-Free Imitation Learning with Policy Optimization

In imitation learning, an agent learns how to behave in an environment with an unknown cost function by mimicking expert demonstrations. Existing imitation learning algorithms typically involve solving a sequence of planning or…

Machine Learning · Computer Science 2016-06-17 Jonathan Ho, Jayesh K. Gupta, Stefano Ermon

Efficient Sample Reuse in Policy Gradients with Parameter-based Exploration

The policy gradient approach is a flexible and powerful reinforcement learning method particularly for problems with continuous actions such as robot control. A common challenge in this scenario is how to reduce the variance of policy…

Machine Learning · Computer Science 2013-01-18 Tingting Zhao, Hirotaka Hachiya, Voot Tangkaratt, Jun Morimoto, Masashi Sugiyama

Deterministic Policy Optimization by Combining Pathwise and Score Function Estimators for Discrete Action Spaces

Policy optimization methods have shown great promise in solving complex reinforcement and imitation learning tasks. While model-free methods are broadly applicable, they often require many samples to optimize complex policies. Model-based…

Artificial Intelligence · Computer Science 2017-11-23 Daniel Levy, Stefano Ermon

Uncertainty-aware Model-based Policy Optimization

Model-based reinforcement learning has the potential to be more sample efficient than model-free approaches. However, existing model-based methods are vulnerable to model bias, which leads to poor generalization and asymptotic performance…

Machine Learning · Computer Science 2019-06-27 Tung-Long Vuong, Kenneth Tran

Model-Based Policy Gradients with Parameter-Based Exploration by Least-Squares Conditional Density Estimation

The goal of reinforcement learning (RL) is to let an agent learn an optimal control policy in an unknown environment so that future expected rewards are maximized. The model-free RL approach directly learns the policy based on data samples.…

Machine Learning · Statistics 2013-07-22 Syogo Mori, Voot Tangkaratt, Tingting Zhao, Jun Morimoto, Masashi Sugiyama

Model-Augmented Actor-Critic: Backpropagating through Paths

Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator to augment the data for policy optimization or value function learning. In this paper, we show how to make more effective use of the…

Machine Learning · Computer Science 2020-05-19 Ignasi Clavera, Violet Fu, Pieter Abbeel

Model-Free Active Exploration in Reinforcement Learning

We study the problem of exploration in Reinforcement Learning and present a novel model-free solution. We adopt an information-theoretical viewpoint and start from the instance-specific lower bound of the number of samples that have to be…

Machine Learning · Computer Science 2024-07-02 Alessio Russo, Alexandre Proutiere

Model-based Policy Optimization using Symbolic World Model

The application of learning-based control methods in robotics presents significant challenges. One is that model-free reinforcement learning algorithms use observation data with low sample efficiency. To address this challenge, a prevalent…

Machine Learning · Computer Science 2024-07-19 Andrey Gorodetskiy, Konstantin Mironov, Aleksandr Panov

Adversarial Imitation via Variational Inverse Reinforcement Learning

We consider a problem of learning the reward and policy from expert examples under unknown dynamics. Our proposed method builds on the framework of generative adversarial networks and introduces the empowerment-regularized maximum-entropy…

Machine Learning · Computer Science 2019-02-26 Ahmed H. Qureshi, Byron Boots, Michael C. Yip

Reinforcement Learning with an Abrupt Model Change

The problem of reinforcement learning is considered where the environment or the model undergoes a change. An algorithm is proposed that an agent can apply in such a problem to achieve the optimal long-time discounted reward. The algorithm…

Systems and Control · Electrical Eng. & Systems 2023-04-25 Wuxia Chen, Taposh Banerjee, Jemin George, Carl Busart

On the model-based stochastic value gradient for continuous reinforcement learning

For over a decade, model-based reinforcement learning has been seen as a way to leverage control-based domain knowledge to improve the sample-efficiency of reinforcement learning agents. While model-based agents are conceptually appealing,…

Machine Learning · Computer Science 2021-05-28 Brandon Amos, Samuel Stanton, Denis Yarats, Andrew Gordon Wilson

An Actor Critic Method for Free Terminal Time Optimal Control

Optimal control problems with free terminal time present many challenges including nonsmooth and discontinuous control laws, irregular value functions, many local optima, and the curse of dimensionality. To overcome these issues, we propose…

Optimization and Control · Mathematics 2022-08-08 Evan Burton, Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang

On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation

Reinforcement learning, mathematically described by Markov Decision Problems, may be approached either through dynamic programming or policy search. Actor-critic algorithms combine the merits of both approaches by alternating between steps…

Machine Learning · Computer Science 2023-01-31 Harshat Kumar, Alec Koppel, Alejandro Ribeiro

An Approximate Policy Iteration Viewpoint of Actor-Critic Algorithms

In this work, we consider policy-based methods for solving the reinforcement learning problem, and establish the sample complexity guarantees. A policy-based algorithm typically consists of an actor and a critic. We consider using various…

Machine Learning · Computer Science 2023-01-16 Zaiwei Chen, Siva Theja Maguluri

Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning

The quintessential model-based reinforcement-learning agent iteratively refines its estimates or prior beliefs about the true underlying model of the environment. Recent empirical successes in model-based reinforcement learning with…

Machine Learning · Computer Science 2022-11-02 Dilip Arumugam, Benjamin Van Roy

Model-based Policy Optimization with Unsupervised Model Adaptation

Model-based reinforcement learning methods learn a dynamics model with real data sampled from the environment and leverage it to generate simulated data to derive an agent. However, due to the potential distribution mismatch between…

Machine Learning · Computer Science 2020-10-29 Jian Shen, Han Zhao, Weinan Zhang, Yong Yu

Neural Network Approaches for Parameterized Optimal Control

We consider numerical approaches for deterministic, finite-dimensional optimal control problems whose dynamics depend on unknown or uncertain parameters. We seek to amortize the solution over a set of relevant parameters in an offline stage…

Optimization and Control · Mathematics 2024-02-16 Deepanshu Verma, Nick Winovich, Lars Ruthotto, Bart van Bloemen Waanders

Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation

We consider the problem of imitation learning from a finite set of expert trajectories, without access to reinforcement signals. The classical approach of extracting the expert's reward function via inverse reinforcement learning, followed…

Machine Learning · Computer Science 2019-06-10 Ruohan Wang, Carlo Ciliberto, Pierluigi Amadori, Yiannis Demiris

Batch mode active learning for efficient parameter estimation

For many tasks of data analysis, we may only have the information of the explanatory variable and the evaluation of the response values are quite expensive. While it is impractical or too costly to obtain the responses of all units, a…

Computation · Statistics 2023-04-07 Wei Zheng, Ting Tian, Xueqin Wang

Sample-Efficient Model-based Actor-Critic for an Interactive Dialogue Task

Human-computer interactive systems that rely on machine learning are becoming paramount to the lives of millions of people who use digital assistants on a daily basis. Yet, further advances are limited by the availability of data and the…

Machine Learning · Computer Science 2020-04-29 Katya Kudashkina, Valliappa Chockalingam, Graham W. Taylor, Michael Bowling