Related papers: Offline Reinforcement Learning with Pseudometric L…

Offline Reinforcement Learning as Anti-Exploration

Offline Reinforcement Learning (RL) aims at learning an optimal control from a fixed dataset, without interactions with the system. An agent in this setting should avoid selecting actions whose consequences cannot be predicted from the…

Machine Learning · Computer Science 2021-06-14 Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, Matthieu Geist

Offline Reinforcement Learning with Imputed Rewards

Offline Reinforcement Learning (ORL) offers a robust solution to training agents in applications where interactions with the environment must be strictly limited due to cost, safety, or lack of accurate simulation environments. Despite its…

Machine Learning · Computer Science 2024-07-16 Carlo Romeo, Andrew D. Bagdanov

Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes

In applications of offline reinforcement learning to observational data, such as in healthcare or education, a general concern is that observed actions might be affected by unobserved factors, inducing confounding and biasing estimates…

Machine Learning · Computer Science 2023-03-24 Andrew Bennett, Nathan Kallus

Towards Data-Driven Offline Simulations for Online Reinforcement Learning

Modern decision-making systems, from robots to web recommendation engines, are expected to adapt: to user preferences, changing circumstances or even new tasks. Yet, it is still uncommon to deploy a dynamically learning agent (rather than a…

Machine Learning · Computer Science 2022-11-15 Shengpu Tang, Felipe Vieira Frujeri, Dipendra Misra, Alex Lamb, John Langford, Paul Mineiro, Sebastian Kochman

Offline Reinforcement Learning from Images with Latent Space Models

Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions. Offline RL enables extensive use and re-use of historical datasets, while also alleviating safety concerns…

Machine Learning · Computer Science 2020-12-22 Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn

Behavior Prior Representation learning for Offline Reinforcement Learning

Offline reinforcement learning (RL) struggles in environments with rich and noisy inputs, where the agent only has access to a fixed dataset without environment interactions. Past works have proposed common workarounds based on the…

Machine Learning · Computer Science 2023-03-01 Hongyu Zang, Xin Li, Jie Yu, Chen Liu, Riashat Islam, Remi Tachet Des Combes, Romain Laroche

Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization

Offline reinforcement learning (RL) is a variant of RL where the policy is learned from a previously collected dataset of trajectories and rewards. In our work, we propose a practical approach to offline RL with large language models…

Computation and Language · Computer Science 2026-02-17 Subhojyoti Mukherjee, Viet Dac Lai, Raghavendra Addanki, Ryan Rossi, Seunghyun Yoon, Trung Bui, Anup Rao, Jayakumar Subramanian, Branislav Kveton

Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees

This article introduces the theory of offline reinforcement learning in large state spaces, where good policies are learned from historical data without online interactions with the environment. Key concepts introduced include expressivity…

Machine Learning · Computer Science 2025-10-07 Nan Jiang, Tengyang Xie

Learning Control Policies for Variable Objectives from Offline Data

Offline reinforcement learning provides a viable approach to obtain advanced control strategies for dynamical systems, in particular when direct interaction with the environment is not available. In this paper, we introduce a conceptual…

Machine Learning · Computer Science 2024-01-04 Marc Weber, Phillip Swazinna, Daniel Hein, Steffen Udluft, Volkmar Sterzing

Grid-Mapping Pseudo-Count Constraint for Offline Reinforcement Learning

Offline reinforcement learning learns from a static dataset without interacting with environments, which ensures security and thus owns a good application prospect. However, directly applying naive reinforcement learning algorithm usually…

Machine Learning · Computer Science 2024-11-08 Yi Shen, Hanyan Huang

Offline Reinforcement Learning with Implicit Q-Learning

Offline reinforcement learning requires reconciling two conflicting aims: learning a policy that improves over the behavior policy that collected the dataset, while at the same time minimizing the deviation from the behavior policy so as to…

Machine Learning · Computer Science 2021-10-13 Ilya Kostrikov, Ashvin Nair, Sergey Levine

What are the Statistical Limits of Offline RL with Linear Function Approximation?

Offline reinforcement learning seeks to utilize offline (observational) data to guide the learning of (causal) sequential decision making strategies. The hope is that offline reinforcement learning coupled with function approximation…

Machine Learning · Computer Science 2020-10-23 Ruosong Wang, Dean P. Foster, Sham M. Kakade

A Behavior Regularized Implicit Policy for Offline Reinforcement Learning

Offline reinforcement learning enables learning from a fixed dataset, without further interactions with the environment. The lack of environmental interactions makes the policy training vulnerable to state-action pairs far from the training…

Machine Learning · Statistics 2022-10-11 Shentao Yang, Zhendong Wang, Huangjie Zheng, Yihao Feng, Mingyuan Zhou

PLAS: Latent Action Space for Offline Reinforcement Learning

The goal of offline reinforcement learning is to learn a policy from a fixed dataset, without further interactions with the environment. This setting will be an increasingly more important paradigm for real-world applications of…

Robotics · Computer Science 2020-11-17 Wenxuan Zhou, Sujay Bajracharya, David Held

Evaluation-Time Policy Switching for Offline Reinforcement Learning

Offline reinforcement learning (RL) looks at learning how to optimally solve tasks using a fixed dataset of interactions from the environment. Many off-policy algorithms developed for online learning struggle in the offline setting as they…

Machine Learning · Computer Science 2025-03-18 Natinael Solomon Neggatu, Jeremie Houssineau, Giovanni Montana

Offline Primal-Dual Reinforcement Learning for Linear MDPs

Offline Reinforcement Learning (RL) aims to learn a near-optimal policy from a fixed dataset of transitions collected by another policy. This problem has attracted a lot of attention recently, but most existing methods with strong…

Machine Learning · Computer Science 2023-05-23 Germano Gabbianelli, Gergely Neu, Nneka Okolo, Matteo Papini

Real-World Offline Reinforcement Learning from Vision Language Model Feedback

Offline reinforcement learning can enable policy learning from pre-collected, sub-optimal datasets without online interactions. This makes it ideal for real-world robots and safety-critical scenarios, where collecting online data or expert…

Robotics · Computer Science 2025-08-07 Sreyas Venkataraman, Yufei Wang, Ziyu Wang, Navin Sriram Ravie, Zackory Erickson, David Held

Leveraging Fully Observable Policies for Learning under Partial Observability

Reinforcement learning in partially observable domains is challenging due to the lack of observable state information. Thankfully, learning offline in a simulator with such state information is often possible. In particular, we propose a…

Robotics · Computer Science 2022-11-11 Hai Nguyen, Andrea Baisero, Dian Wang, Christopher Amato, Robert Platt

Multi-agent Off-policy Actor-Critic Reinforcement Learning for Partially Observable Environments

This study proposes the use of a social learning method to estimate a global state within a multi-agent off-policy actor-critic algorithm for reinforcement learning (RL) operating in a partially observable environment. We assume that the…

Machine Learning · Computer Science 2024-07-09 Ainur Zhaikhan, Ali H. Sayed

A Workflow for Offline Model-Free Robotic Reinforcement Learning

Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction. This can allow robots to acquire generalizable skills from large and diverse datasets, without any…

Machine Learning · Computer Science 2021-09-24 Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine