Related papers: Does DQN Learn?

On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\epsilon$-Greedy Exploration

This paper provides a theoretical understanding of Deep Q-Network (DQN) with the $\varepsilon$-greedy exploration in deep reinforcement learning. Despite the tremendous empirical achievement of the DQN, its theoretical characterization…

Machine Learning · Computer Science 2023-10-26 Shuai Zhang, Hongkang Li, Meng Wang, Miao Liu, Pin-Yu Chen, Songtao Lu, Sijia Liu, Keerthiram Murugesan, Subhajit Chaudhury

A Theoretical Analysis of Deep Q-Learning

Despite the great empirical success of deep reinforcement learning, its theoretical foundation is less well understood. In this work, we make the first attempt to theoretically understand the deep Q-network (DQN) algorithm (Mnih et al.,…

Machine Learning · Computer Science 2020-02-25 Jianqing Fan, Zhaoran Wang, Yuchen Xie, Zhuoran Yang

Linear $Q$-Learning Does Not Diverge in $L^2$: Convergence Rates to a Bounded Set

$Q$-learning is one of the most fundamental reinforcement learning algorithms. It is widely believed that $Q$-learning with linear function approximation (i.e., linear $Q$-learning) suffers from possible divergence until the recent work…

Machine Learning · Computer Science 2025-05-28 Xinyu Liu, Zixuan Xie, Shangtong Zhang

Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis

Deep Q-Learning is an important reinforcement learning algorithm, which involves training a deep neural network, called Deep Q-Network (DQN), to approximate the well-known Q-function. Although wildly successful under laboratory conditions,…

Machine Learning · Computer Science 2021-04-13 Arunselvan Ramaswamy, Eyke Hüllermeier

Using Deep Q-Learning to Control Optimization Hyperparameters

We present a novel definition of the reinforcement learning state, actions and reward function that allows a deep Q-network (DQN) to learn to control an optimization hyperparameter. Using Q-learning with experience replay, we train two DQNs…

Optimization and Control · Mathematics 2016-06-21 Samantha Hansen

Convex Q-Learning, Part 1: Deterministic Optimal Control

It is well known that the extension of Watkins' algorithm to general function approximation settings is challenging: does the projected Bellman equation have a solution? If so, is the solution useful in the sense of generating a good…

Optimization and Control · Mathematics 2020-08-11 Prashant G. Mehta, Sean P. Meyn

Convergent and Efficient Deep Q Network Algorithm

Despite the empirical success of the deep Q network (DQN) reinforcement learning algorithm and its variants, DQN is still not well understood and it does not guarantee convergence. In this work, we show that DQN can indeed diverge and cease…

Machine Learning · Computer Science 2022-05-04 Zhikang T. Wang, Masahito Ueda

M$^2$DQN: A Robust Method for Accelerating Deep Q-learning Network

Deep Q-learning Network (DQN) is a successful way which combines reinforcement learning with deep neural networks and leads to a widespread application of reinforcement learning. One challenging problem when applying DQN or other…

Machine Learning · Computer Science 2022-09-19 Zhe Zhang, Yukun Zou, Junjie Lai, Qing Xu

Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network

The deep Q-network (DQN) and return-based reinforcement learning are two promising algorithms proposed in recent years. DQN brings advances to complex sequential decision problems, while return-based algorithms have advantages in making use…

Machine Learning · Computer Science 2019-12-02 Wenjia Meng, Qian Zheng, Long Yang, Pengfei Li, Gang Pan

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies

Using deep neural nets as function approximator for reinforcement learning tasks have recently been shown to be very powerful for solving problems approaching real-world complexity. Using these results as a benchmark, we discuss the role…

Machine Learning · Computer Science 2016-01-21 Vincent François-Lavet, Raphael Fonteneau, Damien Ernst

DQN Performance with Epsilon Greedy Policies and Prioritized Experience Replay

We present a detailed study of Deep Q-Networks in finite environments, emphasizing the impact of epsilon-greedy exploration schedules and prioritized experience replay. Through systematic experimentation, we evaluate how variations in…

Machine Learning · Computer Science 2025-11-06 Daniel Perkins, Oscar J. Escobar, Luke Green

A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation

Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms. Despite its empirical success, the non-asymptotic convergence rate of neural Q-learning…

Machine Learning · Computer Science 2020-03-05 Pan Xu, Quanquan Gu

$\beta$-DQN: Improving Deep Q-Learning By Evolving the Behavior

While many sophisticated exploration methods have been proposed, their lack of generality and high computational cost often lead researchers to favor simpler methods like $\epsilon$-greedy. Motivated by this, we introduce $\beta$-DQN, a…

Machine Learning · Computer Science 2025-10-29 Hongming Zhang, Fengshuo Bai, Chenjun Xiao, Chao Gao, Bo Xu, Martin Müller

An Experimental Comparison Between Temporal Difference and Residual Gradient with Neural Network Approximation

Gradient descent or its variants are popular in training neural networks. However, in deep Q-learning with neural network approximation, a type of reinforcement learning, gradient descent (also known as Residual Gradient (RG)) is barely…

Machine Learning · Computer Science 2022-11-15 Shuyu Yin, Tao Luo, Peilin Liu, Zhi-Qin John Xu

Deep Reinforcement Learning with Double Q-learning

The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can…

Machine Learning · Computer Science 2015-12-10 Hado van Hasselt, Arthur Guez, David Silver

Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima

Temporal-difference learning (TD), coupled with neural networks, is among the most fundamental building blocks of deep reinforcement learning. However, due to the nonlinearity in value function approximation, such a coupling leads to…

Machine Learning · Computer Science 2020-04-16 Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang

ConQUR: Mitigating Delusional Bias in Deep Q-learning

Delusional bias is a fundamental source of error in approximate Q-learning. To date, the only techniques that explicitly address delusion require comprehensive search using tabular value estimates. In this paper, we develop efficient…

Machine Learning · Computer Science 2020-03-02 Andy Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier

Confounding Robust Deep Reinforcement Learning: A Causal Approach

A key task in Artificial Intelligence is learning effective policies for controlling agents in unknown environments to optimize performance measures. Off-policy learning methods, like Q-learning, allow learners to make optimal decisions…

Artificial Intelligence · Computer Science 2025-10-27 Mingxuan Li, Junzhe Zhang, Elias Bareinboim

Self-correcting Q-Learning

The Q-learning algorithm is known to be affected by the maximization bias, i.e. the systematic overestimation of action values, an important issue that has recently received renewed attention. Double Q-learning has been proposed as an…

Machine Learning · Computer Science 2021-02-03 Rong Zhu, Mattia Rigotti

Characterizing the Action-Generalization Gap in Deep Q-Learning

We study the action generalization ability of deep Q-learning in discrete action spaces. Generalization is crucial for efficient reinforcement learning (RL) because it allows agents to use knowledge learned from past experiences on new…

Artificial Intelligence · Computer Science 2022-05-12 Zhiyuan Zhou, Cameron Allen, Kavosh Asadi, George Konidaris