Low-rank State-action Value-function Approximation

Sergio Rozada; Victor Tenorio; Antonio G. Marques

Artificial Intelligence 2021-04-20 v1

Abstract

Value functions are central to Dynamic Programming and Reinforcement Learning, but their exact estimation suffers from the curse of dimensionality, which challenges the development of practical value-function (VF) estimation algorithms. Several approaches have been proposed to overcome this issue, from non-parametric schemes that aggregate states or actions to parametric approximations of state and action VFs via, e.g., linear estimators or deep neural networks. Notably, several problems with high-dimensional state spaces are well approximated by an intrinsic low-rank structure. Motivated by this and leveraging results from low-rank optimization, this paper proposes different stochastic algorithms to estimate a low-rank factorization of the $Q(s, a)$ matrix. This is a non-parametric alternative to VF approximation that dramatically reduces the computational and sample complexities relative to classical $Q$-learning methods that estimate $Q(s,a)$ separately for each state-action pair.
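To make the idea of a factorized $Q(s, a)$ matrix concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes the factorization $Q \approx L R^\top$ with $L \in \mathbb{R}^{|S| \times k}$ and $R \in \mathbb{R}^{|A| \times k}$, and applies a plain semi-gradient temporal-difference step to the row of each factor touched by a transition. The names `td_update`, `L`, `R`, the rank `k`, the step size, and the random toy problem are all illustrative assumptions.

```python
import numpy as np


def td_update(L, R, s, a, r, s_next, gamma=0.9, alpha=0.05):
    """One stochastic update of the factors of Q ~= L @ R.T (hypothetical helper).

    L has shape (num_states, k), R has shape (num_actions, k).
    Both are modified in place; the TD error is returned.
    """
    q_sa = L[s] @ R[a]                           # current estimate of Q(s, a)
    target = r + gamma * np.max(R @ L[s_next])   # bootstrapped Q-learning target
    td = target - q_sa
    l_old = L[s].copy()                          # keep the old row for the R update
    L[s] += alpha * td * R[a]                    # semi-gradient step on the state factor
    R[a] += alpha * td * l_old                   # semi-gradient step on the action factor
    return td


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_states, num_actions, k = 50, 5, 3

    # Toy problem (assumed for illustration): random transitions and rewards.
    P = rng.dirichlet(np.ones(num_states), size=(num_states, num_actions))
    rewards = rng.random((num_states, num_actions))

    L = 0.1 * rng.standard_normal((num_states, k))
    R = 0.1 * rng.standard_normal((num_actions, k))

    s = 0
    for _ in range(5000):
        # Epsilon-greedy action selection on the factorized estimate.
        a = rng.integers(num_actions) if rng.random() < 0.1 else int(np.argmax(R @ L[s]))
        s_next = rng.choice(num_states, p=P[s, a])
        td_update(L, R, s, a, rewards[s, a], s_next, gamma=0.5, alpha=0.01)
        s = s_next

    print("tabular parameters: ", num_states * num_actions)
    print("low-rank parameters:", (num_states + num_actions) * k)
```

The sketch stores $(|S| + |A|)\,k$ numbers instead of $|S|\,|A|$, and each transition updates an entire row of both factors, so information from one state-action pair is shared with every pair that uses the same state or action.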

Keywords

nonnegative matrix factorization; matrix factorization; reinforcement learning

Cite

@article{arxiv.2104.08805,
  title  = {Low-rank State-action Value-function Approximation},
  author = {Sergio Rozada and Victor Tenorio and Antonio G. Marques},
  journal= {arXiv preprint arXiv:2104.08805},
  year   = {2021}
}
