Related papers: Online Sub-Sampling for Reinforcement Learning wit…

200 papers

In many real-world settings, reinforcement learning systems suffer performance degradation when the environment encountered at deployment differs from that observed during training. Distributionally robust reinforcement learning (DR-RL)…

Machine Learning · Computer Science 2026-03-05 Debamita Ghosh , George K. Atia , Yue Wang

This paper investigates a hybrid learning framework for reinforcement learning (RL) in which the agent can leverage both an offline dataset and online interactions to learn the optimal policy. We present a unified algorithm and analysis and…

Machine Learning · Computer Science 2025-07-01 Ruiquan Huang , Donghao Li , Chengshuai Shi , Cong Shen , Jing Yang

We present a reduction from reinforcement learning (RL) to no-regret online learning based on the saddle-point formulation of RL, by which "any" online algorithm with sublinear regret can generate policies with provable performance…

Machine Learning · Computer Science 2020-01-03 Ching-An Cheng , Remi Tachet des Combes , Byron Boots , Geoff Gordon

Deep reinforcement learning has achieved impressive successes yet often requires a very large amount of interaction data. This result is perhaps unsurprising, as using complicated function approximation often requires more data to fit, and…

Machine Learning · Computer Science 2020-11-20 Jonathan N. Lee , Aldo Pacchiano , Vidya Muthukumar , Weihao Kong , Emma Brunskill

Reinforcement learning (RL) in the real world necessitates the development of procedures that enable agents to explore without causing harm to themselves or others. The most successful solutions to the problem of safe RL leverage offline…

Machine Learning · Computer Science 2025-01-09 Alexander Quessy , Thomas Richardson , Sebastian East

Reinforcement learning (RL) has emerged as a promising strategy for finetuning small language models (SLMs) to solve targeted tasks such as math and coding. However, RL algorithms tend to be resource-intensive, taking a significant amount…

Machine Learning · Computer Science 2025-10-07 Lianghuan Huang , Sagnik Anupam , Insup Lee , Shuo Li , Osbert Bastani

A central issue lying at the heart of online reinforcement learning (RL) is data efficiency. While a number of recent works achieved asymptotically minimal regret in online RL, the optimality of these results is only guaranteed in a…

Machine Learning · Computer Science 2025-04-30 Zihan Zhang , Yuxin Chen , Jason D. Lee , Simon S. Du

We consider the offline reinforcement learning problem, where the aim is to learn a decision making policy from logged data. Offline RL -- particularly when coupled with (value) function approximation to allow for generalization in large or…

Machine Learning · Computer Science 2022-08-31 Dylan J. Foster , Akshay Krishnamurthy , David Simchi-Levi , Yunzong Xu

We study the sample complexity of online reinforcement learning in the general non-episodic setting of nonlinear dynamical systems with continuous state and action spaces. Our analysis accommodates a large class of dynamical…

Machine Learning · Computer Science 2026-03-02 Michael Muehlebach , Zhiyu He , Michael I. Jordan

Offline reinforcement learning (RL) allows for the training of competent agents from offline datasets without any interaction with the environment. Online finetuning of such offline models can further improve performance. But how should we…

Machine Learning · Computer Science 2023-03-31 Yicheng Luo , Jackie Kay , Edward Grefenstette , Marc Peter Deisenroth

We consider reinforcement learning (RL) methods in offline domains without additional online data collection, such as mobile health applications. Most existing policy optimization algorithms in the computer science literature are…

Machine Learning · Statistics 2022-07-28 Chengchun Shi , Shikai Luo , Yuan Le , Hongtu Zhu , Rui Song

Sample efficiency and exploration remain major challenges in online reinforcement learning (RL). A powerful approach that can be applied to address these issues is the inclusion of offline data, such as prior trajectories from a human…

Machine Learning · Computer Science 2023-06-01 Philip J. Ball , Laura Smith , Ilya Kostrikov , Sergey Levine

Online reinforcement learning (RL) excels in complex, safety-critical domains but suffers from sample inefficiency, training instability, and limited interpretability. Data attribution provides a principled way to trace model behavior back…

Machine Learning · Computer Science 2025-10-07 Yuzheng Hu , Fan Wu , Haotian Ye , David Forsyth , James Zou , Nan Jiang , Jiaqi W. Ma , Han Zhao

Value function approximation is important in modern reinforcement learning (RL) problems especially when the state space is (infinitely) large. Despite the importance and wide applicability of value function approximation, its theoretical…

Machine Learning · Computer Science 2023-02-24 Hanlin Zhu , Ruosong Wang , Jason D. Lee

Offline reinforcement learning (RL) enables policy learning from static data but often suffers from poor coverage of the state-action space and distributional shift problems. This problem can be addressed by allowing limited online…

Machine Learning · Computer Science 2026-02-03 Soumyadeep Roy , Shashwat Kushwaha , Ambedkar Dukkipati

Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning…

Reinforcement learning (RL) problems are fundamental in online decision-making and have been instrumental in finding an optimal policy for Markov decision processes (MDPs). Function approximations are usually deployed to handle large or…

Machine Learning · Computer Science 2025-05-20 Jiashuo Jiang , Yiming Zong , Yinyu Ye

One of the most natural approaches to reinforcement learning (RL) with function approximation is value iteration, which inductively generates approximations to the optimal value function by solving a sequence of regression problems. To…

Machine Learning · Computer Science 2024-06-19 Noah Golowich , Ankur Moitra

Leveraging offline data is a promising way to improve the sample efficiency of online reinforcement learning (RL). This paper expands the pool of usable data for offline-to-online RL by leveraging abundant non-curated data that is…

Machine Learning · Computer Science 2025-05-20 Yi Zhao , Aidan Scannell , Wenshuai Zhao , Yuxin Hou , Tianyu Cui , Le Chen , Dieter Büchler , Arno Solin , Juho Kannala , Joni Pajarinen

Offline reinforcement learning (RL) aims to learn a policy that maximizes the expected return using a given static dataset of transitions. However, offline RL faces the distribution shift problem. The policy constraint offline RL method is…

Machine Learning · Computer Science 2025-12-24 Yuanhao Chen , Qi Liu , Pengbin Chen , Zhongjian Qiao , Yanjie Li