English
Related papers

Related papers: Model-Augmented Actor-Critic: Backpropagating thro…

200 papers

Deterministic-policy actor-critic algorithms for continuous control improve the actor by plugging its actions into the critic and ascending the action-value gradient, which is obtained by chaining the actor's Jacobian matrix with the…

Artificial Intelligence · Computer Science 2020-10-23 Pierluca D'Oro , Wojciech Jaśkowski

Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data. In this paper, we study the role of model usage in policy…

Machine Learning · Computer Science 2021-11-30 Michael Janner , Justin Fu , Marvin Zhang , Sergey Levine

We focus on a simulation-based optimization problem of choosing the best design from the feasible space. Although the simulation model can be queried with finite samples, its internal processing rule cannot be utilized in the optimization…

Machine Learning · Computer Science 2021-11-02 Kuo Li , Qing-Shan Jia , Jiaqi Yan

Reinforcement learning, mathematically described by Markov Decision Problems, may be approached either through dynamic programming or policy search. Actor-critic algorithms combine the merits of both approaches by alternating between steps…

Machine Learning · Computer Science 2023-01-31 Harshat Kumar , Alec Koppel , Alejandro Ribeiro

Recent advances in deep reinforcement learning have demonstrated the capability of learning complex control policies from many types of environments. When learning policies for safety-critical applications, it is essential to be sensitive…

Machine Learning · Computer Science 2019-11-12 Yichuan Charlie Tang , Jian Zhang , Ruslan Salakhutdinov

Model-based reinforcement learning has the potential to be more sample efficient than model-free approaches. However, existing model-based methods are vulnerable to model bias, which leads to poor generalization and asymptotic performance…

Machine Learning · Computer Science 2019-06-27 Tung-Long Vuong , Kenneth Tran

Model-free reinforcement learning algorithms can compute policy gradients given sampled environment transitions, but require large amounts of data. In contrast, model-based methods can use the learned model to generate new data, but model…

Machine Learning · Computer Science 2022-03-04 Lukas P. Fröhlich , Maksym Lefarov , Melanie N. Zeilinger , Felix Berkenkamp

This work investigates the formal policy synthesis of continuous-state stochastic dynamic systems given high-level specifications in linear temporal logic. To learn an optimal policy that maximizes the satisfaction probability, we take a…

Artificial Intelligence · Computer Science 2023-04-21 Lening Li , Zhentian Qian

Reinforcement learning algorithms require a large amount of samples; this often limits their real-world applications on even simple tasks. Such a challenge is more outstanding in multi-agent tasks, as each step of operation is more costly…

Machine Learning · Computer Science 2022-09-05 Yali Du , Chengdong Ma , Yuchen Liu , Runji Lin , Hao Dong , Jun Wang , Yaodong Yang

This scientific paper propose a novel portfolio optimization model using an improved deep reinforcement learning algorithm. The objective function of the optimization model is the weighted sum of the expectation and value at risk(VaR) of…

Machine Learning · Computer Science 2022-08-30 Boyi Jin

Policy gradient algorithms have proven to be successful in diverse decision making and control tasks. However, these methods suffer from high sample complexity and instability issues. In this paper, we address these challenges by providing…

Machine Learning · Computer Science 2021-03-17 Yannis Flet-Berliac , Reda Ouhamma , Odalric-Ambrym Maillard , Philippe Preux

Value-based reinforcement-learning algorithms provide state-of-the-art results in model-free discrete-action settings, and tend to outperform actor-critic algorithms. We argue that actor-critic algorithms are limited by their need for an…

Machine Learning · Computer Science 2019-06-13 Denis Steckelmacher , Hélène Plisnier , Diederik M. Roijers , Ann Nowé

Reinforcement learning algorithms are known to be sample inefficient, and often performance on one task can be substantially improved by leveraging information (e.g., via pre-training) on other related tasks. In this work, we propose a…

Machine Learning · Computer Science 2019-10-15 Jonathan Lebensold , William Hamilton , Borja Balle , Doina Precup

In this work, we consider policy-based methods for solving the reinforcement learning problem, and establish the sample complexity guarantees. A policy-based algorithm typically consists of an actor and a critic. We consider using various…

Machine Learning · Computer Science 2023-01-16 Zaiwei Chen , Siva Theja Maguluri

In recent years, deep off-policy actor-critic algorithms have become a dominant approach to reinforcement learning for continuous control. One of the primary drivers of this improved performance is the use of pessimistic value updates to…

Machine Learning · Computer Science 2022-04-07 Ted Moskovitz , Jack Parker-Holder , Aldo Pacchiano , Michael Arbel , Michael I. Jordan

Reinforcement learning algorithms are typically geared towards optimizing the expected return of an agent. However, in many practical applications, low variance in the return is desired to ensure the reliability of an algorithm. In this…

Machine Learning · Computer Science 2021-02-04 Arushi Jain , Gandharv Patil , Ayush Jain , Khimya Khetarpal , Doina Precup

We study the problem of off-policy critic evaluation in several variants of value-based off-policy actor-critic algorithms. Off-policy actor-critic algorithms require an off-policy critic evaluation step, to estimate the value of the new…

Machine Learning · Computer Science 2019-12-12 Riashat Islam , Raihan Seraj , Samin Yeasar Arnob , Doina Precup

This paper presents the first actor-critic algorithm for off-policy reinforcement learning. Our algorithm is online and incremental, and its per-time-step complexity scales linearly with the number of learned weights. Previous work on…

Machine Learning · Computer Science 2015-03-20 Thomas Degris , Martha White , Richard S. Sutton

Value-based algorithms are a cornerstone of off-policy reinforcement learning due to their simplicity and training stability. However, their use has traditionally been restricted to discrete action spaces, as they rely on estimating…

Machine Learning · Computer Science 2025-10-23 Yigit Korkmaz , Urvi Bhuwania , Ayush Jain , Erdem Bıyık

Substantial advancements to model-based reinforcement learning algorithms have been impeded by the model-bias induced by the collected data, which generally hurts performance. Meanwhile, their inherent sample efficiency warrants utility for…

‹ Prev 1 2 3 10 Next ›