Periodic Regularized Q-Learning

Machine Learning 2026-02-04 v1 Artificial Intelligence

Abstract

In reinforcement learning (RL), Q-learning is a fundamental algorithm whose convergence is guaranteed in the tabular setting. However, this guarantee does not carry over to linear function approximation, where Q-learning can diverge. To overcome this limitation, a significant line of research has introduced regularization techniques to ensure stable convergence under function approximation. In this work, we propose a new algorithm, periodic regularized Q-learning (PRQ). We first introduce regularization at the level of the projection operator and explicitly construct a regularized projected value iteration (RP-VI): with an appropriately regularized projection operator, the resulting projected value iteration becomes a contraction. We then extend this regularized projection to the stochastic, sample-based setting to obtain the PRQ algorithm, and provide a rigorous theoretical analysis that proves finite-time convergence guarantees for PRQ under linear function approximation.
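
To make the idea of a regularized projected value iteration concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the exact form of the regularizer, so a ridge-style projection Φ(ΦᵀDΦ + ηI)⁻¹ΦᵀD is assumed here. The function names, the weighting vector `d`, and the parameter `eta` are hypothetical and chosen only for illustration.

```python
import numpy as np


def regularized_projection(Phi, d, eta):
    """Return Phi (Phi^T D Phi + eta I)^{-1} Phi^T D.

    Assumption: a ridge-style regularizer; the paper's operator may differ.
    Phi has shape (S*A, k); d is a positive weight vector of length S*A.
    """
    D = np.diag(d)
    G = Phi.T @ D @ Phi + eta * np.eye(Phi.shape[1])
    return Phi @ np.linalg.solve(G, Phi.T @ D)


def bellman_optimality(Q, P, R, gamma):
    """(TQ)(s, a) = R(s, a) + gamma * sum_s' P(s' | s, a) * max_a' Q(s', a').

    Q and R have shape (S, A); P has shape (S, A, S).
    """
    return R + gamma * (P @ Q.max(axis=1))


def rp_vi(Phi, d, P, R, gamma, eta, iters=500):
    """Iterate Phi theta <- Pi_eta T(Phi theta), tracking only theta."""
    S, A = R.shape
    D = np.diag(d)
    G = Phi.T @ D @ Phi + eta * np.eye(Phi.shape[1])
    theta = np.zeros(Phi.shape[1])
    for _ in range(iters):
        Q = (Phi @ theta).reshape(S, A)
        target = bellman_optimality(Q, P, R, gamma).reshape(-1)
        # Regularized least-squares fit of the Bellman target.
        theta = np.linalg.solve(G, Phi.T @ D @ target)
    return theta
```

With one-hot features (`Phi` equal to the identity) and `eta = 0`, the update reduces to ordinary tabular value iteration, which is a convenient sanity check. For `eta > 0`, this assumed projection has norm strictly below one in the `d`-weighted Euclidean norm. Whether the composed iteration contracts depends on the choice of `eta` and on the conditions established in the paper.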

Cite

@article{arxiv.2602.03301,
  title  = {Periodic Regularized Q-Learning},
  author = {Hyukjun Yang and Han-Dong Lim and Donghwan Lee},
  journal= {arXiv preprint arXiv:2602.03301},
  year   = {2026}
}