Adaptive Resolving Methods for Reinforcement Learning with Function Approximations

Machine Learning 2025-05-20 v1

Abstract

Reinforcement learning (RL) is fundamental to online decision-making and has been instrumental in finding optimal policies for Markov decision processes (MDPs). Function approximation is usually deployed to handle large or infinite state-action spaces. In this work, we consider RL problems with function approximation and develop a new algorithm to solve them efficiently. Our algorithm is based on the linear programming (LP) reformulation and re-solves the LP at each iteration, with the LP updated as new data arrive. Such a resolving scheme enables our algorithm to achieve an instance-dependent sample complexity guarantee: given $N$ data points, the output of our algorithm enjoys an instance-dependent $\tilde{O}(1/N)$ suboptimality gap. Compared with the $O(1/\sqrt{N})$ worst-case guarantee established in the previous literature, our instance-dependent guarantee is tighter when the underlying instance is favorable, and numerical experiments also demonstrate the efficient empirical performance of our algorithm.

Cite

@article{arxiv.2505.12037,
  title  = {Adaptive Resolving Methods for Reinforcement Learning with Function Approximations},
  author = {Jiashuo Jiang and Yiming Zong and Yinyu Ye},
  journal = {arXiv preprint arXiv:2505.12037},
  year   = {2025}
}