Provably Efficient Kernelized Q-Learning

Machine Learning 2022-04-25 v1

Abstract

We propose and analyze a kernelized version of Q-learning. Although a kernel space is typically infinite-dimensional, extensive studies have shown that generalization is affected only by the effective dimension of the data. We incorporate these ideas into the Q-learning framework and derive regret bounds for arbitrary kernels. In particular, we provide concrete bounds for linear kernels and Gaussian RBF kernels; notably, the latter bound is almost identical to the former, except that the actual dimension is replaced by a different notion of dimensionality. Finally, we test our algorithm on a suite of classic control tasks; remarkably, under the Gaussian RBF kernel, it achieves reasonably good performance after only 1000 environment steps, while its neural network counterpart, deep Q-learning, still struggles.
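
To make the idea of an effective dimension concrete, the following is a minimal sketch, not the paper's algorithm or its exact definition. It assumes one common notion from the kernel ridge regression literature, d_eff = tr(K (K + reg I)^{-1}), and compares a linear kernel with a Gaussian RBF kernel on synthetic data. The function names, the bandwidth, the regularizer `reg`, and the data are all illustrative assumptions.

```python
import numpy as np

def linear_kernel(X, Y):
    # Gram matrix of inner products <x, y>.
    return X @ Y.T

def rbf_kernel(X, Y, bandwidth=1.0):
    # Gaussian RBF kernel exp(-||x - y||^2 / (2 * bandwidth^2)).
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def effective_dimension(K, reg=1.0):
    # Assumed definition: d_eff = sum_i lam_i / (lam_i + reg),
    # where lam_i are the eigenvalues of the Gram matrix K.
    eigvals = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    return float(np.sum(eigvals / (eigvals + reg)))

# Synthetic data: 200 points in 3 dimensions (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

d_lin = effective_dimension(linear_kernel(X, X))
d_rbf = effective_dimension(rbf_kernel(X, X))
print(f"linear kernel: {d_lin:.2f}, RBF kernel: {d_rbf:.2f}")
```

Under this assumed definition, the linear kernel's effective dimension can never exceed the input dimension (3 here), whereas the RBF kernel's value depends on the bandwidth and on how the data are spread, and is bounded only by the number of samples.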

Cite

@article{arxiv.2204.10349,
  title  = {Provably Efficient Kernelized Q-Learning},
  author = {Shuang Liu and Hao Su},
  journal= {arXiv preprint arXiv:2204.10349},
  year   = {2022}
}