Related papers: Periodic Q-Learning

Finite Horizon Q-learning: Stability, Convergence, Simulations and an application on Smart Grids

Q-learning is a popular reinforcement learning algorithm. This algorithm has however been studied and analysed mainly in the infinite horizon setting. There are several important applications which can be modeled in the framework of finite…

Machine Learning · Computer Science 2022-08-09 Vivek VP , Dr. Shalabh Bhatnagar

Target-Based Temporal Difference Learning

The use of target networks has been a popular and key component of recent deep Q-learning algorithms for reinforcement learning, yet little is known from the theory side. In this work, we introduce a new family of target-based temporal…

Machine Learning · Computer Science 2019-09-24 Donghwan Lee , Niao He

Periodic Regularized Q-Learning

In reinforcement learning (RL), Q-learning is a fundamental algorithm whose convergence is guaranteed in the tabular setting. However, this convergence guarantee does not hold under linear function approximation. To overcome this…

Machine Learning · Computer Science 2026-02-04 Hyukjun Yang , Han-Dong Lim , Donghwan Lee

Deep Q-Learning with Gradient Target Tracking

This paper introduces Q-learning with gradient target tracking, a novel reinforcement learning framework that provides a learned continuous target update mechanism as an alternative to the conventional hard update paradigm. In the standard…

Machine Learning · Computer Science 2025-07-21 Bum Geun Park , Taeho Lee , Donghwan Lee

Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis

Deep Q-Learning is an important reinforcement learning algorithm, which involves training a deep neural network, called Deep Q-Network (DQN), to approximate the well-known Q-function. Although wildly successful under laboratory conditions,…

Machine Learning · Computer Science 2021-04-13 Arunselvan Ramaswamy , Eyke Hüllermeier

Bridging the Performance Gap Between Target-Free and Target-Based Reinforcement Learning

The use of target networks in deep reinforcement learning is a widely popular solution to mitigate the brittleness of semi-gradient approaches and stabilize learning. However, target networks notoriously require additional memory and delay…

Machine Learning · Computer Science 2026-03-02 Théo Vincent , Yogesh Tripathi , Tim Faust , Abdullah Akgül , Yaniv Oren , Melih Kandemir , Jan Peters , Carlo D'Eramo

Penalized Q-Learning for Dynamic Treatment Regimes

A dynamic treatment regime effectively incorporates both accrued information and long-term effects of treatment from specially designed clinical trials. As these become more and more popular in conjunction with longitudinal data from…

Methodology · Statistics 2011-08-29 Rui Song , Weiwei Wang , Donglin Zeng , Michael R. Kosorok

Cross Learning in Deep Q-Networks

In this work, we propose a novel cross Q-learning algorithm, aimed at alleviating the well-known overestimation problem in value-based reinforcement learning methods, particularly in deep Q-networks where the overestimation is exaggerated…

Artificial Intelligence · Computer Science 2020-09-30 Xing Wang , Alexander Vinel

Why Target Networks Stabilise Temporal Difference Methods

Integral to recent successes in deep reinforcement learning has been a class of temporal difference methods that use infrequently updated target values for policy evaluation in a Markov Decision Process. Yet a complete theoretical…

Machine Learning · Computer Science 2023-08-15 Mattie Fellows , Matthew J. A. Smith , Shimon Whiteson

Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis

In Markov decision processes (MDPs), quantile risk measures such as Value-at-Risk are a standard metric for modeling RL agents' preferences for certain outcomes. This paper proposes a new Q-learning algorithm for quantile optimization in…

Machine Learning · Computer Science 2024-11-01 Jia Lin Hau , Erick Delage , Esther Derman , Mohammad Ghavamzadeh , Marek Petrik

$QD$-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

The paper considers a class of multi-agent Markov decision processes (MDPs), in which the network agents respond differently (as manifested by the instantaneous one-stage random costs) to a global controlled state and the control actions of…

Machine Learning · Statistics 2015-06-04 Soummya Kar , José M. F. Moura , H. Vincent Poor

Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples

In this article, we propose a novel algorithm for deep reinforcement learning named Expert Q-learning. Expert Q-learning is inspired by Dueling Q-learning and aims at incorporating semi-supervised learning into reinforcement learning…

Machine Learning · Computer Science 2024-06-26 Li Meng , Anis Yazidi , Morten Goodwin , Paal Engelstad

Deep Reinforcement Learning for Adaptive Learning Systems

In this paper, we formulate the adaptive learning problem---the problem of how to find an individualized learning plan (called policy) that chooses the most appropriate learning materials based on learner's latent traits---faced in adaptive…

Machine Learning · Computer Science 2020-04-21 Xiao Li , Hanchen Xu , Jinming Zhang , Hua-hua Chang

Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization

Reinforcement learning (RL) is a classical tool to solve network control or policy optimization problems in unknown environments. The original Q-learning suffers from performance and complexity challenges across very large networks. Herein,…

Machine Learning · Computer Science 2024-09-02 Talha Bozkus , Urbashi Mitra

Qgraph-bounded Q-learning: Stabilizing Model-Free Off-Policy Deep Reinforcement Learning

In state of the art model-free off-policy deep reinforcement learning, a replay memory is used to store past experience and derive all network updates. Even if both state and action spaces are continuous, the replay memory only holds a…

Machine Learning · Computer Science 2020-07-16 Sabrina Hoppe , Marc Toussaint

Financial Trading as a Game: A Deep Reinforcement Learning Approach

An automatic program that generates constant profit from the financial market is lucrative for every market practitioner. Recent advances in deep reinforcement learning provide a framework toward end-to-end training of such a trading agent.…

Trading and Market Microstructure · Quantitative Finance 2018-07-10 Chien Yi Huang

A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation

Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms. Despite its empirical success, the non-asymptotic convergence rate of neural Q-learning…

Machine Learning · Computer Science 2020-03-05 Pan Xu , Quanquan Gu

Self-correcting Q-Learning

The Q-learning algorithm is known to be affected by the maximization bias, i.e. the systematic overestimation of action values, an important issue that has recently received renewed attention. Double Q-learning has been proposed as an…

Machine Learning · Computer Science 2021-02-03 Rong Zhu , Mattia Rigotti

Q-Learning with Differential Entropy of Q-Tables

It is well-known that information loss can occur in the classic and simple Q-learning algorithm. Entropy-based policy search methods were introduced to replace Q-learning and to design algorithms that are more robust against information…

Machine Learning · Computer Science 2020-06-29 Tung D. Nguyen , Kathryn E. Kasmarik , Hussein A. Abbass

Transfer Q-learning

Time-inhomogeneous finite-horizon Markov decision processes (MDP) are frequently employed to model decision-making in dynamic treatment regimes and other statistical reinforcement learning (RL) scenarios. These fields, especially healthcare…

Machine Learning · Computer Science 2025-10-21 Elynn Chen , Sai Li , Michael I. Jordan