Related papers: Benchmarking Batch Deep Reinforcement Learning Alg…

Off-Policy Deep Reinforcement Learning without Exploration

Many practical applications of reinforcement learning constrain agents to learn from a fixed batch of data which has already been gathered, without offering further possibility for data collection. In this paper, we demonstrate that due to…

Machine Learning · Computer Science 2019-08-13 Scott Fujimoto, David Meger, Doina Precup

Automatic Reward Shaping from Confounded Offline Data

A key task in Artificial Intelligence is learning effective policies for controlling agents in unknown environments to optimize performance measures. Off-policy learning methods, like Q-learning, allow learners to make optimal decisions…

Artificial Intelligence · Computer Science 2025-09-10 Mingxuan Li, Junzhe Zhang, Elias Bareinboim

Confounding Robust Deep Reinforcement Learning: A Causal Approach

A key task in Artificial Intelligence is learning effective policies for controlling agents in unknown environments to optimize performance measures. Off-policy learning methods, like Q-learning, allow learners to make optimal decisions…

Artificial Intelligence · Computer Science 2025-10-27 Mingxuan Li, Junzhe Zhang, Elias Bareinboim

Importance of using appropriate baselines for evaluation of data-efficiency in deep reinforcement learning for Atari

Reinforcement learning (RL) has seen great advancements in the past few years. Nevertheless, the consensus among the RL community is that currently used methods, despite all their benefits, suffer from extreme data inefficiency, especially…

Machine Learning · Computer Science 2020-04-01 Kacper Kielak

An Optimistic Perspective on Offline Reinforcement Learning

Off-policy reinforcement learning (RL) using a fixed offline dataset of logged interactions is an important consideration in real world applications. This paper studies offline RL using the DQN replay dataset comprising the entire replay…

Machine Learning · Computer Science 2020-11-25 Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi

Deep Reinforcement Learning with Double Q-learning

The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can…

Machine Learning · Computer Science 2015-12-10 Hado van Hasselt, Arthur Guez, David Silver

Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning

Off-policy reinforcement learning algorithms promise to be applicable in settings where only a fixed data-set (batch) of environment interactions is available and no new experience can be acquired. This property makes these algorithms…

Machine Learning · Computer Science 2020-06-18 Noah Y. Siegel, Jost Tobias Springenberg, Felix Berkenkamp, Abbas Abdolmaleki, Michael Neunert, Thomas Lampe, Roland Hafner, Nicolas Heess, Martin Riedmiller

Causality and Batch Reinforcement Learning: Complementary Approaches To Planning In Unknown Domains

Reinforcement learning algorithms have had tremendous successes in online learning settings. However, these successes have relied on low-stakes interactions between the algorithmic agent and its environment. In many settings where RL could…

Machine Learning · Computer Science 2020-06-05 James Bannon, Brad Windsor, Wenbo Song, Tao Li

Transferring Deep Reinforcement Learning with Adversarial Objective and Augmentation

In the past few years, deep reinforcement learning has been proven to solve problems which have complex states like video games or board games. The next step of intelligent agents would be able to generalize between tasks, and using prior…

Machine Learning · Computer Science 2018-09-05 Shu-Hsuan Hsu, I-Chao Shen, Bing-Yu Chen

Policy Distillation

Policies for complex visual tasks have been successfully learned with deep reinforcement learning, using an approach called deep Q-networks (DQN), but relatively large (task-specific) networks and extensive training are needed to achieve…

Machine Learning · Computer Science 2016-01-08 Andrei A. Rusu, Sergio Gomez Colmenarejo, Caglar Gulcehre, Guillaume Desjardins, James Kirkpatrick, Razvan Pascanu, Volodymyr Mnih, Koray Kavukcuoglu, Raia Hadsell

Learning Dialog Policies from Weak Demonstrations

Deep reinforcement learning is a promising approach to training a dialog manager, but current methods struggle with the large state and action spaces of multi-domain dialog systems. Building upon Deep Q-learning from Demonstrations (DQfD),…

Computation and Language · Computer Science 2020-08-14 Gabriel Gordon-Hall, Philip John Gorinski, Shay B. Cohen

Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks

Reinforcement learning systems require good representations to work well. For decades practical success in reinforcement learning was limited to small domains. Deep reinforcement learning systems, on the other hand, are scalable, not…

Machine Learning · Computer Science 2020-03-18 Sina Ghiassian, Banafsheh Rafiee, Yat Long Lo, Adam White

Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog

Most deep reinforcement learning (RL) systems are not able to learn effectively from off-policy data, especially if they cannot explore online in the environment. These are critical shortcomings for applying RL to real-world problems where…

Machine Learning · Computer Science 2019-07-09 Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard

Deep Reinforcement Learning With Macro-Actions

Deep reinforcement learning has been shown to be a powerful framework for learning policies from complex high-dimensional sensory inputs to actions in complex tasks, such as the Atari domain. In this paper, we explore output representation…

Machine Learning · Computer Science 2016-06-16 Ishan P. Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan

Interpretable performance analysis towards offline reinforcement learning: A dataset perspective

Offline reinforcement learning (RL) has increasingly become the focus of artificial intelligence research due to its wide real-world applications where the collection of data may be difficult, time-consuming, or costly. In this paper, we…

Machine Learning · Computer Science 2021-05-13 Chenyang Xi, Bo Tang, Jiajun Shen, Xinfu Liu, Feiyu Xiong, Xueying Li

Reinforcement Learning and Video Games

Reinforcement learning has exceeded human-level performance in game playing AI with deep learning methods according to the experiments from DeepMind on Go and Atari games. Deep learning solves high dimension input problems which stop the…

Machine Learning · Computer Science 2019-09-12 Yue Zheng

Deep Reinforcement Learning in Applied Control: Challenges, Analysis, and Insights

Over the past decade, remarkable progress has been made in adopting deep neural networks to enhance the performance of conventional reinforcement learning. A notable milestone was the development of Deep Q-Networks (DQN), which achieved…

Systems and Control · Electrical Eng. & Systems 2025-07-14 Klinsmann Agyei, Pouria Sarhadi, Daniel Polani

Domain Adaptation for Reinforcement Learning on the Atari

Deep reinforcement learning agents have recently been successful across a variety of discrete and continuous control tasks; however, they can be slow to train and require a large number of interactions with the environment to learn a…

Machine Learning · Computer Science 2018-12-19 Thomas Carr, Maria Chli, George Vogiatzis

A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games

Reinforcement learning is concerned with identifying reward-maximizing behaviour policies in environments that are initially unknown. State-of-the-art reinforcement learning approaches, such as deep Q-networks, are model-free and learn to…

Artificial Intelligence · Computer Science 2017-08-18 Felix Leibfried, Nate Kushman, Katja Hofmann

Generalization and Regularization in DQN

Deep reinforcement learning algorithms have shown an impressive ability to learn complex control policies in high-dimensional tasks. However, despite the ever-increasing performance on popular benchmarks, policies learned by deep…

Machine Learning · Computer Science 2020-01-22 Jesse Farebrother, Marlos C. Machado, Michael Bowling