Adaptive Resolving Methods for Reinforcement Learning with Function Approximations

Jiashuo Jiang; Yiming Zong; Yinyu Ye

Machine Learning 2025-05-20 v1

Abstract

Reinforcement learning (RL) problems are fundamental in online decision-making, and RL methods have been instrumental in finding optimal policies for Markov decision processes (MDPs). Function approximation is usually deployed to handle large or infinite state-action spaces. In this work, we consider RL problems with function approximation and develop a new algorithm to solve them efficiently. Our algorithm is based on the linear programming (LP) reformulation and re-solves the LP at each iteration, updated with newly arrived data. This resolving scheme enables our algorithm to achieve an instance-dependent sample complexity guarantee: given $N$ data samples, the output of our algorithm enjoys an instance-dependent $\tilde{O}(1/N)$ suboptimality gap. Compared with the $O(1/\sqrt{N})$ worst-case guarantee established in the previous literature, our instance-dependent guarantee is tighter when the underlying instance is favorable, and numerical experiments also demonstrate the efficient empirical performance of our algorithm.
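
For orientation, the following is a sketch of the textbook LP reformulation of a tabular discounted MDP. The abstract does not state which LP the paper uses, and the version with function approximation will differ, so everything here is an illustrative assumption: states $s$, actions $a$, reward $r(s,a)$, transition kernel $P(s' \mid s,a)$, discount factor $\gamma \in (0,1)$, and initial state distribution $\mu$ are notation introduced for this sketch only. The primal LP over value functions $V$ reads

$$
\min_{V} \; (1-\gamma) \sum_{s} \mu(s)\, V(s)
\quad \text{s.t.} \quad
V(s) \;\ge\; r(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \quad \forall (s,a),
$$

and its dual over occupancy measures $q$ reads

$$
\max_{q \ge 0} \; \sum_{s,a} r(s,a)\, q(s,a)
\quad \text{s.t.} \quad
\sum_{a} q(s',a) \;=\; (1-\gamma)\, \mu(s') + \gamma \sum_{s,a} P(s' \mid s,a)\, q(s,a) \quad \forall s'.
$$

In this textbook setting, an optimal policy can be read off an optimal dual solution by choosing, in each state, an action $a$ with $q(s,a) > 0$. A resolving scheme of the kind described above would presumably replace the unknown $r$ and $P$ with estimates built from the data collected so far and solve the LP again as more data arrive.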

Keywords

reinforcement learning; policy gradient; reinforcement learning from human feedback

Cite

@article{arxiv.2505.12037,
  title  = {Adaptive Resolving Methods for Reinforcement Learning with Function Approximations},
  author = {Jiashuo Jiang and Yiming Zong and Yinyu Ye},
  journal= {arXiv preprint arXiv:2505.12037},
  year   = {2025}
}
