Related papers: Simplifying Deep Temporal Difference Learning

200 papers

In state-of-the-art model-free off-policy deep reinforcement learning, a replay memory is used to store past experience and derive all network updates. Even if both state and action spaces are continuous, the replay memory only holds a…

Machine Learning · Computer Science 2020-07-16 Sabrina Hoppe, Marc Toussaint

Deep Reinforcement Learning (RL) methods rely on experience replay to approximate the minibatched supervised learning setting; however, unlike supervised learning where access to lots of training data is crucial to generalization,…

Machine Learning · Computer Science 2021-02-24 Brett Daley, Cameron Hickert, Christopher Amato

Reinforcement Learning (RL) has opened up new opportunities to enhance existing smart systems that generally include a complex decision-making process. However, modern RL algorithms, e.g., Deep Q-Networks (DQN), are based on deep neural…

Machine Learning · Computer Science 2023-06-22 Yang Ni, Danny Abraham, Mariam Issa, Yeseong Kim, Pietro Mercati, Mohsen Imani

Off-policy reinforcement learning (RL) using a fixed offline dataset of logged interactions is an important consideration in real world applications. This paper studies offline RL using the DQN replay dataset comprising the entire replay…

Machine Learning · Computer Science 2020-11-25 Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi

Deep Reinforcement Learning (DRL) has shown outstanding performance on inducing effective action policies that maximize expected long-term return on many complex tasks. Much of DRL work has been focused on sequences of events with discrete…

Machine Learning · Computer Science 2021-05-07 Yeo Jin Kim, Min Chi

Deep reinforcement learning algorithms often use two networks for value function optimization: an online network, and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge…

Machine Learning · Computer Science 2023-04-19 Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L. Littman, Alexander J. Smola

The core challenge of offline reinforcement learning (RL) is dealing with the (potentially catastrophic) extrapolation error induced by the distribution shift between the history dataset and the desired policy. A large portion of prior work…

Machine Learning · Computer Science 2023-07-27 Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist

Vision-based deep reinforcement learning (RL) typically obtains performance benefit by using high capacity and relatively large convolutional neural networks (CNN). However, a large network leads to higher inference costs (power, latency,…

Machine Learning · Computer Science 2019-05-01 Sam Green, Craig M. Vineyard, Çetin Kaya Koç

A key task in Artificial Intelligence is learning effective policies for controlling agents in unknown environments to optimize performance measures. Off-policy learning methods, like Q-learning, allow learners to make optimal decisions…

Artificial Intelligence · Computer Science 2025-09-10 Mingxuan Li, Junzhe Zhang, Elias Bareinboim

The $Q$-learning algorithm is a simple and widely-used stochastic approximation scheme for reinforcement learning, but the basic protocol can exhibit instability in conjunction with function approximation. Such instability can be observed…

Machine Learning · Computer Science 2022-06-03 Andrea Zanette, Martin J. Wainwright

Offline reinforcement learning (RL) aims to learn a policy from a static dataset without further interactions with the environment. Collecting sufficiently large datasets for offline RL is exhausting since this data collection requires…

Artificial Intelligence · Computer Science 2025-10-22 Jongchan Park, Mingyu Park, Donghwan Lee

Learning complex policies with Reinforcement Learning (RL) is often hindered by instability and slow convergence, a problem exacerbated by the difficulty of reward engineering. Imitation Learning (IL) from expert demonstrations bypasses…

Machine Learning · Computer Science 2026-05-19 Sayambhu Sen, Shalabh Bhatnagar

Learning the value function of a given policy (target policy) from the data samples obtained from a different policy (behavior policy) is an important problem in Reinforcement Learning (RL). This problem is studied under the setting of…

Machine Learning · Computer Science 2019-11-14 Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar

Training agents via off-policy deep reinforcement learning (RL) requires a large memory, named replay memory, that stores past experiences used for learning. These experiences are sampled, uniformly or non-uniformly, to create the batches…

Machine Learning · Computer Science 2022-12-27 Bumgeun Park, Taeyoung Kim, Woohyeon Moon, Luiz Felipe Vecchietti, Dongsoo Har

While agents trained by Reinforcement Learning (RL) can solve increasingly challenging tasks directly from visual observations, generalizing learned skills to novel environments remains very challenging. Extensive use of data augmentation…

Machine Learning · Computer Science 2021-12-10 Nicklas Hansen, Hao Su, Xiaolong Wang

Deep Reinforcement Learning (RL) has considerably advanced over the past decade. At the same time, state-of-the-art RL algorithms require a large computational budget in terms of training time to converge. Recent work has started to…

Artificial neural networks are promising for general function approximation but challenging to train on non-independent or non-identically distributed data due to catastrophic forgetting. The experience replay buffer, a standard component…

Machine Learning · Computer Science 2023-04-12 Qingfeng Lan, Yangchen Pan, Jun Luo, A. Rupam Mahmood

This paper introduces the QDQN-DPER framework to enhance the efficiency of quantum reinforcement learning (QRL) in solving sequential decision tasks. The framework incorporates prioritized experience replay and asynchronous training into…

Quantum Physics · Physics 2023-04-20 Samuel Yen-Chi Chen

The potential of offline reinforcement learning (RL) is that high-capacity models trained on large, heterogeneous datasets can lead to agents that generalize broadly, analogously to similar advances in vision and NLP. However, recent works…

Machine Learning · Computer Science 2023-04-19 Aviral Kumar, Rishabh Agarwal, Xinyang Geng, George Tucker, Sergey Levine

The use of target networks is a common practice in deep reinforcement learning for stabilizing the training; however, theoretical understanding of this technique is still limited. In this paper, we study the so-called periodic Q-learning…

Machine Learning · Computer Science 2020-02-25 Donghwan Lee, Niao He