Related papers: Is Q-learning Provably Efficient?

Efficient Model-free Reinforcement Learning in Metric Spaces

Model-free Reinforcement Learning (RL) algorithms such as Q-learning [Watkins, Dayan 92] have been widely used in practice and can achieve human-level performance in applications such as video games [Mnih et al. 15]. Recently, equipped with…

Machine Learning · Computer Science 2019-05-03 Zhao Song, Wen Sun

Stochastic Lipschitz Q-Learning

In an episodic Markov Decision Process (MDP) problem, an online algorithm chooses from a set of actions in a sequence of $H$ trials, where $H$ is the episode length, in order to maximize the total payoff of the chosen actions. Q-learning,…

Machine Learning · Computer Science 2019-07-11 Xu Zhu

Is Q-Learning Provably Efficient? An Extended Analysis

This work extends the analysis of the theoretical results presented within the paper Is Q-Learning Provably Efficient? by Jin et al. We include a survey of related research to contextualize the need for strengthening the theoretical…

Machine Learning · Computer Science 2020-09-23 Kushagra Rastogi, Jonathan Lee, Fabrice Harel-Canada, Aditya Joglekar

Model-Free Linear Quadratic Control via Reduction to Expert Prediction

Model-free approaches for reinforcement learning (RL) and continuous control find policies based only on past states and rewards, without fitting a model of the system dynamics. They are appealing as they are general purpose and easy to…

Machine Learning · Computer Science 2018-10-09 Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvari

The Effect of Q-function Reuse on the Total Regret of Tabular, Model-Free, Reinforcement Learning

Some reinforcement learning methods suffer from high sample complexity, which makes them impractical in real-world situations. $Q$-function reuse, a transfer learning method, is one way to reduce the sample complexity of learning,…

Machine Learning · Computer Science 2021-03-09 Volodymyr Tkachuk, Sriram Ganapathi Subramanian, Matthew E. Taylor

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Achieving sample efficiency in online episodic reinforcement learning (RL) requires optimally balancing exploration and exploitation. When it comes to a finite-horizon episodic Markov decision process with $S$ states, $A$ actions and…

Machine Learning · Computer Science 2022-10-18 Gen Li, Laixi Shi, Yuxin Chen, Yuejie Chi

Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP

A fundamental question in reinforcement learning is whether model-free algorithms are sample efficient. Recently, Jin et al. proposed a Q-learning algorithm with a UCB exploration policy, and proved it has nearly optimal…

Machine Learning · Computer Science 2019-09-30 Kefan Dong, Yuanhao Wang, Xiaoyu Chen, Liwei Wang

Sharper Model-free Reinforcement Learning for Average-reward Markov Decision Processes

We develop several provably efficient model-free reinforcement learning (RL) algorithms for infinite-horizon average-reward Markov Decision Processes (MDPs). We consider both the online setting and the setting with access to a simulator. In the…

Machine Learning · Computer Science 2023-06-29 Zihan Zhang, Qiaomin Xie

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$. We propose a model-free algorithm UCB-Advantage and prove that it…

Machine Learning · Computer Science 2020-06-09 Zihan Zhang, Yuan Zhou, Xiangyang Ji

Provably Efficient Model-Free Constrained RL with Linear Function Approximation

We study the constrained reinforcement learning problem, in which an agent aims to maximize the expected cumulative reward subject to a constraint on the expected total value of a utility function. In contrast to existing model-based…

Machine Learning · Computer Science 2023-01-10 Arnob Ghosh, Xingyu Zhou, Ness Shroff

Temporal Difference Models: Model-Free Deep RL for Model-Based Control

Model-free reinforcement learning (RL) is a powerful, general tool for learning complex behaviors. However, its sample complexity is often impractically large for solving challenging real-world problems, even with off-policy algorithms such…

Machine Learning · Computer Science 2020-02-25 Vitchyr Pong, Shixiang Gu, Murtaza Dalal, Sergey Levine

Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time

A crucial problem in reinforcement learning is learning the optimal policy. We study this in tabular infinite-horizon discounted Markov decision processes under the online setting. The existing algorithms either fail to achieve regret…

Machine Learning · Computer Science 2023-12-13 Xiang Ji, Gen Li

Model-Free Non-Stationary RL: Near-Optimal Regret and Applications in Multi-Agent RL and Inventory Control

We consider model-free reinforcement learning (RL) in non-stationary Markov decision processes. Both the reward functions and the state transition functions are allowed to vary arbitrarily over time as long as their cumulative variations do…

Machine Learning · Computer Science 2022-08-23 Weichao Mao, Kaiqing Zhang, Ruihao Zhu, David Simchi-Levi, Tamer Başar

Sublinear Regret for a Class of Continuous-Time Linear-Quadratic Reinforcement Learning Problems

We study reinforcement learning (RL) for a class of continuous-time linear-quadratic (LQ) control problems for diffusions, where states are scalar-valued and running control rewards are absent but volatilities of the state processes depend…

Machine Learning · Computer Science 2025-07-25 Yilie Huang, Yanwei Jia, Xun Yu Zhou

Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting

Low-complexity models such as linear function representation play a pivotal role in enabling sample-efficient reinforcement learning (RL). The current paper pertains to a scenario with value-based linear representation, which postulates the…

Machine Learning · Computer Science 2021-10-19 Gen Li, Yuxin Chen, Yuejie Chi, Yuantao Gu, Yuting Wei

Online Robust Reinforcement Learning with Model Uncertainty

Robust reinforcement learning (RL) aims to find a policy that optimizes the worst-case performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust RL, where the uncertainty set is defined to be centered at a…

Machine Learning · Computer Science 2021-10-29 Yue Wang, Shaofeng Zou

Reinforcement Learning for Learning of Dynamical Systems in Uncertain Environment: a Tutorial

This paper reviews model-free reinforcement learning for learning dynamical systems in uncertain environments. For this purpose, the Markov Decision Process (MDP) is reviewed. Furthermore, some learning…

Machine Learning · Computer Science 2019-05-21 Mehran Attar, Mohammadreza Dabirian

PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

Model-based Reinforcement Learning (RL) is a popular learning paradigm due to its potential sample efficiency compared to model-free RL. However, existing empirical model-based RL approaches lack the ability to explore. This work studies a…

Machine Learning · Computer Science 2021-07-16 Yuda Song, Wen Sun

Provably Efficient and Agile Randomized Q-Learning

While Bayesian-based exploration often demonstrates superior empirical performance compared to bonus-based methods in model-based reinforcement learning (RL), its theoretical understanding remains limited for model-free settings. Existing…

Machine Learning · Computer Science 2026-02-05 He Wang, Xingyu Xu, Yuejie Chi

Q-learning-based Model-free Safety Filter

Ensuring safety via safety filters in real-world robotics presents significant challenges, particularly when the system dynamics are complex or unavailable. To handle this issue, learning-based safety filters have recently gained popularity,…

Robotics · Computer Science 2024-12-02 Guo Ning Sue, Yogita Choudhary, Richard Desatnik, Carmel Majidi, John Dolan, Guanya Shi