Related papers: When does return-conditioned supervised learning w…

200 papers

In sequential decision-making problems, Return-Conditioned Supervised Learning (RCSL) has gained increasing recognition for its simplicity and stability in modern decision-making tasks. Unlike traditional offline reinforcement learning (RL)…

Machine Learning · Computer Science 2025-06-11 Zhishuai Liu, Yu Yang, Ruhan Wang, Pan Xu, Dongruo Zhou

Offline goal-conditioned reinforcement learning (GCRL) aims at solving goal-reaching tasks with sparse rewards from an offline dataset. While prior work has demonstrated various approaches for agents to learn near-optimal policies, these…

Robotics · Computer Science 2024-03-05 Chenyang Cao, Zichen Yan, Renhao Lu, Junbo Tan, Xueqian Wang

Offline reinforcement learning (RL) aims to optimize the return given a fixed dataset of agent trajectories without additional interactions with the environment. While algorithm development has progressed rapidly, significant theoretical…

Machine Learning · Computer Science 2025-08-12 Fengdi Che

The paradigm of decision-making has been revolutionised by reinforcement learning and deep learning. Although this has led to significant progress in domains such as robotics, healthcare, and finance, the use of RL in practice is…

Machine Learning · Computer Science 2026-02-23 Daqian Shao

Traditional offline reinforcement learning (RL) methods predominantly operate in a batch-constrained setting. This confines the algorithms to a specific state-action distribution present in the dataset, reducing the effects of…

Machine Learning · Statistics 2025-07-16 Charles A. Hepburn, Yue Jin, Giovanni Montana

Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL. When does this hold true, and which algorithmic components are necessary? Through extensive…

Machine Learning · Computer Science 2022-05-12 Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

Offline reinforcement learning (RL) aims to learn a policy that maximizes the expected return using a given static dataset of transitions. However, offline RL faces the distribution shift problem. The policy constraint offline RL method is…

Machine Learning · Computer Science 2025-12-24 Yuanhao Chen, Qi Liu, Pengbin Chen, Zhongjian Qiao, Yanjie Li

Distributionally robust offline reinforcement learning (RL) aims to find a policy that performs the best under the worst environment within an uncertainty set using an offline dataset collected from a nominal model. While recent advances in…

Machine Learning · Computer Science 2025-01-07 Ruiquan Huang, Yingbin Liang, Jing Yang

Offline Reinforcement Learning (RL) is structured to derive policies from static trajectory data without requiring real-time environment interactions. Recent studies have shown the feasibility of framing offline RL as a sequence modeling…

Machine Learning · Computer Science 2023-09-01 Abdelghani Ghanem, Philippe Ciblat, Mounir Ghogho

Sequential decision making algorithms often struggle to leverage different sources of unstructured offline interaction data. Imitation learning (IL) methods based on supervised learning are robust, but require optimal demonstrations, which…

Machine Learning · Computer Science 2023-04-28 Joey Hejna, Jensen Gao, Dorsa Sadigh

Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning…

The recent success of supervised learning methods on ever larger offline datasets has spurred interest in the reinforcement learning (RL) field to investigate whether the same paradigms can be translated to RL algorithms. This research…

Machine Learning · Computer Science 2021-02-12 Mengjiao Yang, Ofir Nachum

A promising paradigm for offline reinforcement learning (RL) is to constrain the learned policy to stay close to the dataset behaviors, known as policy constraint offline RL. However, existing works heavily rely on the purity of the data,…

Machine Learning · Computer Science 2022-10-20 Chengqian Gao, Ke Xu, Liu Liu, Deheng Ye, Peilin Zhao, Zhiqiang Xu

We study learning optimal policies from a logged dataset, i.e., offline RL, with function approximation. Despite the efforts devoted, existing algorithms with theoretic finite-sample guarantees typically assume exploratory data coverage or…

Machine Learning · Computer Science 2023-05-25 Chenjie Mao

Offline reinforcement learning seeks to utilize offline (observational) data to guide the learning of (causal) sequential decision making strategies. The hope is that offline reinforcement learning coupled with function approximation…

Machine Learning · Computer Science 2020-10-23 Ruosong Wang, Dean P. Foster, Sham M. Kakade

With the widespread adoption of deep learning, reinforcement learning (RL) has experienced a dramatic increase in popularity, scaling to previously intractable problems, such as playing complex games from pixel observations, sustaining…

Machine Learning · Computer Science 2023-04-20 Rafael Figueiredo Prudencio, Marcos R. O. A. Maximo, Esther Luna Colombini

Conventional reinforcement learning (RL) needs an environment to collect fresh data, which is impractical when online interactions are costly. Offline RL provides an alternative solution by directly learning from the previously collected…

Machine Learning · Computer Science 2023-03-15 Han Zheng, Xufang Luo, Pengfei Wei, Xuan Song, Dongsheng Li, Jing Jiang

We consider the offline constrained reinforcement learning (RL) problem, in which the agent aims to compute a policy that maximizes expected return while satisfying given cost constraints, learning only from a pre-collected dataset. This…

Machine Learning · Computer Science 2022-04-20 Jongmin Lee, Cosmin Paduraru, Daniel J. Mankowitz, Nicolas Heess, Doina Precup, Kee-Eung Kim, Arthur Guez

Safe reinforcement learning (RL) trains a constraint satisfaction policy by interacting with the environment. We aim to tackle a more challenging problem: learning a safe policy from an offline dataset. We study the offline safe RL problem…

Machine Learning · Computer Science 2023-06-22 Zuxin Liu, Zijian Guo, Yihang Yao, Zhepeng Cen, Wenhao Yu, Tingnan Zhang, Ding Zhao

In safe offline reinforcement learning (RL), the objective is to develop a policy that maximizes cumulative rewards while strictly adhering to safety constraints, utilizing only offline data. Traditional methods often face difficulties in…

Machine Learning · Computer Science 2026-02-11 Prajwal Koirala, Zhanhong Jiang, Soumik Sarkar, Cody Fleming