Related papers: MOPO: Model-based Offline Policy Optimization

Model-Based Offline Meta-Reinforcement Learning with Regularization

Existing offline reinforcement learning (RL) methods face a few major challenges, particularly the distributional shift between the learned policy and the behavior policy. Offline Meta-RL is emerging as a promising approach to address these…

Machine Learning · Computer Science 2022-07-14 Sen Lin, Jialin Wan, Tengyu Xu, Yingbin Liang, Junshan Zhang

SeMOPO: Learning High-quality Model and Policy from Low-quality Offline Visual Datasets

Model-based offline reinforcement learning (RL) is a promising approach that leverages existing data effectively in many real-world applications, especially those involving high-dimensional inputs like images and videos. To alleviate the…

Computer Vision and Pattern Recognition · Computer Science 2024-06-17 Shenghua Wan, Ziyuan Chen, Le Gan, Shuai Feng, De-Chuan Zhan

Model-Based Offline Planning with Trajectory Pruning

The recent offline reinforcement learning (RL) studies have achieved much progress to make RL usable in real-world systems by learning policies from pre-collected datasets without environment interaction. Unfortunately, existing offline RL…

Artificial Intelligence · Computer Science 2022-04-22 Xianyuan Zhan, Xiangyu Zhu, Haoran Xu

POPO: Pessimistic Offline Policy Optimization

Offline reinforcement learning (RL), also known as batch RL, aims to optimize policy from a large pre-recorded dataset without interaction with the environment. This setting offers the promise of utilizing diverse, pre-collected datasets to…

Machine Learning · Computer Science 2021-01-05 Qiang He, Xinwen Hou

Offline Reinforcement Learning from Images with Latent Space Models

Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions. Offline RL enables extensive use and re-use of historical datasets, while also alleviating safety concerns…

Machine Learning · Computer Science 2020-12-22 Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn

CROP: Conservative Reward for Model-based Offline Policy Optimization

Offline reinforcement learning (RL) aims to optimize a policy using collected data without online interactions. Model-based approaches are particularly appealing for addressing offline RL challenges because of their capability to mitigate…

Machine Learning · Computer Science 2026-04-14 Hao Li, Xiao-Hu Zhou, Shu-Hai Li, Mei-Jiang Gui, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Zeng-Guang Hou

Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief

Model-based offline reinforcement learning (RL) aims to find a highly rewarding policy by leveraging a previously collected static dataset and a dynamics model. While the dynamics model is learned through reuse of the static dataset, its…

Machine Learning · Computer Science 2022-11-01 Kaiyang Guo, Yunfeng Shao, Yanhui Geng

MOReL: Model-Based Offline Reinforcement Learning

In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based solely on a dataset of historical interactions with the environment. The ability to train RL policies offline can greatly expand the applicability…

Machine Learning · Computer Science 2021-03-03 Rahul Kidambi, Aravind Rajeswaran, Praneeth Netrapalli, Thorsten Joachims

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learning from static datasets, without interacting with the underlying environment during the learning process. A key challenge of offline RL is…

Machine Learning · Computer Science 2022-06-16 Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou

Model-based Offline Policy Optimization with Adversarial Network

Model-based offline reinforcement learning (RL), which builds a supervised transition model with logging dataset to avoid costly interactions with the online environment, has been a promising approach for offline policy optimization. As the…

Machine Learning · Computer Science 2023-09-06 Junming Yang, Xingguo Chen, Shengyuan Wang, Bolei Zhang

Model-Based Offline Planning

Offline learning is a key part of making reinforcement learning (RL) useable in real systems. Offline RL looks at scenarios where there is data from a system's operation, but no direct access to the system when learning a policy. Recent…

Machine Learning · Computer Science 2021-03-18 Arthur Argenson, Gabriel Dulac-Arnold

VIPO: Value Function Inconsistency Penalized Offline Reinforcement Learning

Offline reinforcement learning (RL) learns effective policies from pre-collected datasets, offering a practical solution for applications where online interactions are risky or costly. Model-based approaches are particularly advantageous…

Machine Learning · Computer Science 2026-05-14 Xuyang Chen, Keyu Yan, Guojian Wang, Lin Zhao

COOPO: Cyclic Offline-Online Policy Optimization Algorithm

Offline reinforcement learning struggles with distributional shift and constrained performance due to static dataset limitations, while online RL demands prohibitive environment interactions. The recent advent of hybrid offline-to-online…

Machine Learning · Computer Science 2026-05-19 Qisai Liu, Zhanhong Jiang, Joshua Russell Waite, Aditya Balu, Cody Fleming, Soumik Sarkar

A Workflow for Offline Model-Free Robotic Reinforcement Learning

Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction. This can allow robots to acquire generalizable skills from large and diverse datasets, without any…

Machine Learning · Computer Science 2021-09-24 Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine

Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning

Offline reinforcement learning (RL) offers a powerful paradigm for data-driven control. Compared to model-free approaches, offline model-based RL (MBRL) explicitly learns a world model from a static dataset and uses it as a surrogate…

Machine Learning · Computer Science 2026-02-02 Jiayu Chen, Le Xu, Aravind Venugopal, Jeff Schneider

Latent-Variable Advantage-Weighted Policy Optimization for Offline RL

Offline reinforcement learning methods hold the promise of learning policies from pre-collected datasets without the need to query the environment for new transitions. This setting is particularly well-suited for continuous control robotic…

Machine Learning · Computer Science 2022-03-18 Xi Chen, Ali Ghadirzadeh, Tianhe Yu, Yuan Gao, Jianhao Wang, Wenzhe Li, Bin Liang, Chelsea Finn, Chongjie Zhang

Behavior Proximal Policy Optimization

Offline reinforcement learning (RL) is a challenging setting where existing off-policy actor-critic methods perform poorly due to the overestimation of out-of-distribution state-action pairs. Thus, various additional augmentations are…

Machine Learning · Computer Science 2023-02-23 Zifeng Zhuang, Kun Lei, Jinxin Liu, Donglin Wang, Yilang Guo

Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization

Offline reinforcement learning (RL) addresses the problem of learning a performant policy from a fixed batch of data collected by following some behavior policy. Model-based approaches are particularly appealing in the offline setting since…

Machine Learning · Computer Science 2023-03-06 Jihwan Jeong, Xiaoyu Wang, Michael Gimelfarb, Hyunwoo Kim, Baher Abdulhai, Scott Sanner

RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning

Offline reinforcement learning (RL) aims to find performant policies from logged data without further environment interaction. Model-based algorithms, which learn a model of the environment from the dataset and perform conservative policy…

Machine Learning · Computer Science 2022-10-12 Marc Rigter, Bruno Lacerda, Nick Hawes

Diffusion Policies for Risk-Averse Behavior Modeling in Offline Reinforcement Learning

Offline reinforcement learning (RL) presents distinct challenges as it relies solely on observational data. A central concern in this context is ensuring the safety of the learned policy by quantifying uncertainties associated with various…

Machine Learning · Computer Science 2025-07-03 Xiaocong Chen, Siyu Wang, Tong Yu, Lina Yao