Related papers: Sample Complexity of Kernel-Based Q-Learning

Near-Optimal Sample Complexity in Reward-Free Kernel-Based Reinforcement Learning

Reinforcement Learning (RL) problems are being considered under increasingly more complex structures. While tabular and linear models have been thoroughly explored, the analytical study of RL under nonlinear function approximation,…

Machine Learning · Computer Science 2025-09-12 Aya Kayal, Sattar Vakili, Laura Toni, Alberto Bernacchia

A Finite Sample Complexity Bound for Distributionally Robust Q-learning

We consider a reinforcement learning setting in which the deployment environment is different from the training environment. Applying a robust Markov decision processes formulation, we extend the distributionally robust $Q$-learning…

Machine Learning · Computer Science 2024-08-02 Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou

An $L^2$ Analysis of Reinforcement Learning in High Dimensions with Kernel and Neural Network Approximation

Reinforcement learning (RL) algorithms based on high-dimensional function approximation have achieved tremendous empirical success in large-scale problems with an enormous number of states. However, most analysis of such algorithms gives…

Machine Learning · Computer Science 2022-02-17 Jihao Long, Jiequn Han, Weinan E

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

Reinforcement learning (RL) methods have been shown to be capable of learning intelligent behavior in rich domains. However, this has largely been done in simulated domains without adequate focus on the process of building the simulator. In…

Machine Learning · Computer Science 2019-10-24 Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh

Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation

We consider the question of learning $Q$-function in a sample efficient manner for reinforcement learning with continuous state and action spaces under a generative model. If $Q$-function is Lipschitz continuous, then the minimal sample…

Machine Learning · Computer Science 2020-06-12 Devavrat Shah, Dogyoon Song, Zhi Xu, Yuzhe Yang

Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model

The curse of dimensionality is a widely known issue in reinforcement learning (RL). In the tabular setting where the state space $\mathcal{S}$ and the action space $\mathcal{A}$ are both finite, to obtain a nearly optimal policy with…

Machine Learning · Computer Science 2022-10-28 Bingyan Wang, Yuling Yan, Jianqing Fan

Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization

Reinforcement learning (RL) is a classical tool to solve network control or policy optimization problems in unknown environments. The original Q-learning suffers from performance and complexity challenges across very large networks. Herein,…

Machine Learning · Computer Science 2024-09-02 Talha Bozkus, Urbashi Mitra

Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning

Recent advances in large language models (LLMs) have increasingly relied on reinforcement learning (RL) to improve their reasoning capabilities. Three types of approaches have been widely adopted: The first relies on a deep neural network…

Machine Learning · Computer Science 2026-05-19 Shijin Gong, Kai Ye, Jin Zhu, Xinyu Zhang, Hongyi Zhou, Chengchun Shi

Q-Measure-Learning for Continuous State RL: Efficient Implementation and Convergence

We study reinforcement learning in infinite-horizon discounted Markov decision processes with continuous state spaces, where data are generated online from a single trajectory under a Markovian behavior policy. To avoid maintaining an…

Machine Learning · Computer Science 2026-03-05 Shengbo Wang

Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting

Low-complexity models such as linear function representation play a pivotal role in enabling sample-efficient reinforcement learning (RL). The current paper pertains to a scenario with value-based linear representation, which postulates the…

Machine Learning · Computer Science 2021-10-19 Gen Li, Yuxin Chen, Yuejie Chi, Yuantao Gu, Yuting Wei

Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs

In reward-free reinforcement learning (RL), an agent explores the environment first without any reward information, in order to achieve certain learning goals afterwards for any given reward. In this paper we focus on reward-free RL under…

Machine Learning · Computer Science 2023-03-21 Yuan Cheng, Ruiquan Huang, Jing Yang, Yingbin Liang

Is Q-learning Provably Efficient?

Model-free reinforcement learning (RL) algorithms, such as Q-learning, directly parameterize and update value functions or policies without explicitly modeling the environment. They are typically simpler, more flexible to use, and thus more…

Machine Learning · Computer Science 2018-07-11 Chi Jin, Zeyuan Allen-Zhu, Sebastien Bubeck, Michael I. Jordan

PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

Model-based Reinforcement Learning (RL) is a popular learning paradigm due to its potential sample efficiency compared to model-free RL. However, existing empirical model-based RL approaches lack the ability to explore. This work studies a…

Machine Learning · Computer Science 2021-07-16 Yuda Song, Wen Sun

Sample Complexity of Variance-reduced Distributionally Robust Q-learning

Dynamic decision-making under distributional shifts is of fundamental interest in theory and applications of reinforcement learning: The distribution of the environment in which the data is collected can differ from that of the environment…

Machine Learning · Computer Science 2024-09-05 Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou

Ranking Policy Gradient

Sample inefficiency is a long-lasting problem in reinforcement learning (RL). The state-of-the-art estimates the optimal action values while it usually involves an extensive search over the state-action space and unstable optimization.…

Machine Learning · Computer Science 2019-11-27 Kaixiang Lin, Jiayu Zhou

On the Sample Complexity of Reinforcement Learning with Policy Space Generalization

We study the optimal sample complexity in large-scale Reinforcement Learning (RL) problems with policy space generalization, i.e. the agent has a prior knowledge that the optimal policy lies in a known policy space. Existing results show…

Machine Learning · Computer Science 2020-08-18 Wenlong Mou, Zheng Wen, Xi Chen

Overcoming the Long Horizon Barrier for Sample-Efficient Reinforcement Learning with Latent Low-Rank Structure

The practicality of reinforcement learning algorithms has been limited due to poor scaling with respect to the problem size, as the sample complexity of learning an $\epsilon$-optimal policy is $\tilde{\Omega}\left(|S||A|H^3 /…

Machine Learning · Computer Science 2023-06-12 Tyler Sam, Yudong Chen, Christina Lee Yu

A Nonparametric Off-Policy Policy Gradient

Reinforcement learning (RL) algorithms still suffer from high sample complexity despite outstanding recent successes. The need for intensive interactions with the environment is especially observed in many widely popular policy gradient…

Machine Learning · Computer Science 2020-08-04 Samuele Tosatto, Joao Carvalho, Hany Abdulsamad, Jan Peters

Q-learning with Nearest Neighbors

We consider model-free reinforcement learning for infinite-horizon discounted Markov Decision Processes (MDPs) with a continuous state space and unknown transition kernel, when only a single sample path under an arbitrary policy of the…

Machine Learning · Computer Science 2018-10-24 Devavrat Shah, Qiaomin Xie

A study on a Q-Learning algorithm application to a manufacturing assembly problem

The development of machine learning algorithms has been gathering relevance to address the increasing modelling complexity of manufacturing decision-making problems. Reinforcement learning is a methodology with great potential due to the…

Machine Learning · Computer Science 2023-04-18 Miguel Neves, Miguel Vieira, Pedro Neto