Related papers: Zap Q-Learning With Nonlinear Function Approximati…

Stability of Q-Learning Through Design and Optimism

Q-learning has become an important part of the reinforcement learning toolkit since its introduction in the dissertation of Chris Watkins in the 1980s. The purpose of this paper is in part a tutorial on stochastic approximation and…

Machine Learning · Computer Science 2023-08-22 Sean Meyn

Zap Q-Learning for Optimal Stopping Time Problems

The objective in this paper is to obtain fast converging reinforcement learning algorithms to approximate solutions to the problem of discounted cost optimal stopping in an irreducible, uniformly ergodic Markov chain, evolving on a compact…

Systems and Control · Computer Science 2019-10-01 Shuhang Chen, Adithya M. Devraj, Ana Bušić, Sean P. Meyn

Regularized Q-learning

Q-learning is a widely used algorithm in the reinforcement learning community. Under the lookup table setting, its convergence is well established. However, its behavior is known to be unstable in the linear function approximation case. This…

Machine Learning · Computer Science 2025-02-11 Han-Dong Lim, Donghwan Lee

Fastest Convergence for Q-learning

The Zap Q-learning algorithm introduced in this paper is an improvement of Watkins' original algorithm and recent competitors in several respects. It is a matrix-gain algorithm designed so that its asymptotic variance is optimal. Moreover,…

Systems and Control · Computer Science 2018-03-23 Adithya M. Devraj, Sean P. Meyn

Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning

The $Q$-learning algorithm is a simple and widely-used stochastic approximation scheme for reinforcement learning, but the basic protocol can exhibit instability in conjunction with function approximation. Such instability can be observed…

Machine Learning · Computer Science 2022-06-03 Andrea Zanette, Martin J. Wainwright

Periodic Regularized Q-Learning

In reinforcement learning (RL), Q-learning is a fundamental algorithm whose convergence is guaranteed in the tabular setting. However, this convergence guarantee does not hold under linear function approximation. To overcome this…

Machine Learning · Computer Science 2026-02-04 Hyukjun Yang, Han-Dong Lim, Donghwan Lee

Diagnosing Bottlenecks in Deep Q-learning Algorithms

Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL). However, the…

Machine Learning · Computer Science 2019-02-28 Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine

An Elementary Proof that Q-learning Converges Almost Surely

Watkins' and Dayan's Q-learning is a model-free reinforcement learning algorithm that iteratively refines an estimate for the optimal action-value function of an MDP by stochastically "visiting" many state-action pairs [Watkins and Dayan,…

Machine Learning · Computer Science 2021-08-09 Matthew T. Regehr, Alex Ayoub

A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation

Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms. Despite its empirical success, the non-asymptotic convergence rate of neural Q-learning…

Machine Learning · Computer Science 2020-03-05 Pan Xu, Quanquan Gu

Two-Step Q-Learning

Q-learning is a stochastic approximation version of the classic value iteration. The literature has established that Q-learning suffers from both maximization bias and slower convergence. Recently, multi-step algorithms have shown practical…

Machine Learning · Computer Science 2024-07-03 Antony Vijesh, Shreyas S R

Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis

Deep Q-Learning is an important reinforcement learning algorithm, which involves training a deep neural network, called Deep Q-Network (DQN), to approximate the well-known Q-function. Although wildly successful under laboratory conditions,…

Machine Learning · Computer Science 2021-04-13 Arunselvan Ramaswamy, Eyke Hüllermeier

An Output Feedback Q-learning Algorithm for Optimal Control of Nonlinear Systems with Koopman Linear Embedding

In the reinforcement learning literature, strong theoretical guarantees have been obtained for algorithms applicable to LTI systems. However, in the nonlinear case only weaker results have been obtained for algorithms that mostly rely on…

Systems and Control · Electrical Eng. & Systems 2026-04-01 Victor G. Lopez, Malte Heinrich, Matthias A. Müller

Finite-Sample Analysis of Nonlinear Stochastic Approximation with Applications in Reinforcement Learning

Motivated by applications in reinforcement learning (RL), we study a nonlinear stochastic approximation (SA) algorithm under Markovian noise, and establish its finite-sample convergence bounds under various stepsizes. Specifically, we show…

Optimization and Control · Mathematics 2022-01-27 Zaiwei Chen, Sheng Zhang, Thinh T. Doan, John-Paul Clarke, Siva Theja Maguluri

Target Network and Truncation Overcome The Deadly Triad in $Q$-Learning

$Q$-learning with function approximation is one of the most empirically successful while theoretically mysterious reinforcement learning (RL) algorithms, and was identified in Sutton (1999) as one of the most important theoretical open…

Machine Learning · Computer Science 2022-05-04 Zaiwei Chen, John Paul Clarke, Siva Theja Maguluri

Convergence and stability of Q-learning in Hierarchical Reinforcement Learning

Hierarchical Reinforcement Learning promises, among other benefits, to efficiently capture and utilize the temporal structure of a decision-making problem and to enhance continual learning capabilities, but theoretical guarantees lag behind…

Machine Learning · Computer Science 2025-11-24 Massimiliano Manenti, Andrea Iannelli

Towards Characterizing Divergence in Deep Q-Learning

Deep Q-Learning (DQL), a family of temporal difference algorithms for control, employs three techniques collectively known as the 'deadly triad' in reinforcement learning: bootstrapping, off-policy learning, and function approximation.…

Machine Learning · Computer Science 2019-03-22 Joshua Achiam, Ethan Knight, Pieter Abbeel

Stochastic Training of Neural Networks via Successive Convex Approximations

This paper proposes a new family of algorithms for training neural networks (NNs). These are based on recent developments in the field of non-convex optimization, going under the general name of successive convex approximation (SCA)…

Machine Learning · Statistics 2017-06-16 Simone Scardapane, Paolo Di Lorenzo

Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient

Offline reinforcement learning, which aims at optimizing sequential decision-making strategies with historical data, has been extensively applied in real-life applications. State-Of-The-Art algorithms usually leverage powerful function…

Machine Learning · Computer Science 2022-11-28 Ming Yin, Mengdi Wang, Yu-Xiang Wang

Reinforcement Learning by Comparing Immediate Reward

This paper introduces an approach to reinforcement learning that compares immediate rewards using a variation of the Q-Learning algorithm. Unlike conventional Q-Learning, the proposed algorithm compares the current reward with…

Machine Learning · Computer Science 2010-09-15 Punit Pandey, Deepshikha Pandey, Shishir Kumar

Q-learning with online random forests

$Q$-learning is the most fundamental model-free reinforcement learning algorithm. Deployment of $Q$-learning requires approximation of the state-action value function (also known as the $Q$-function). In this work, we provide online random…

Machine Learning · Statistics 2022-04-11 Joosung Min, Lloyd T. Elliott