Related papers: Contrastive Example-Based Control

Contrastive Value Learning: Implicit Models for Simple Offline RL

Model-based reinforcement learning (RL) methods are appealing in the offline setting because they allow an agent to reason about the consequences of actions without interacting with the environment. Prior methods learn a 1-step dynamics…

Machine Learning · Computer Science 2022-11-07 Bogdan Mazoure, Benjamin Eysenbach, Ofir Nachum, Jonathan Tompson

Offline Reinforcement Learning for Road Traffic Control

Traffic signal control is an important problem in urban mobility with significant potential for economic and environmental impact. While there is a growing interest in Reinforcement Learning (RL) for traffic signal control, the work so far…

Artificial Intelligence · Computer Science 2022-12-13 Mayuresh Kunjir, Sanjay Chawla

Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief

Model-based offline reinforcement learning (RL) aims to find a highly rewarding policy by leveraging a previously collected static dataset and a dynamics model. While the dynamics model is learned through reuse of the static dataset, its…

Machine Learning · Computer Science 2022-11-01 Kaiyang Guo, Yunfeng Shao, Yanhui Geng

CROP: Conservative Reward for Model-based Offline Policy Optimization

Offline reinforcement learning (RL) aims to optimize a policy using collected data without online interactions. Model-based approaches are particularly appealing for addressing offline RL challenges because of their capability to mitigate…

Machine Learning · Computer Science 2026-04-14 Hao Li, Xiao-Hu Zhou, Shu-Hai Li, Mei-Jiang Gui, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Zeng-Guang Hou

A stabilizing reinforcement learning approach for sampled systems with partially unknown models

Reinforcement learning is commonly associated with training of reward-maximizing (or cost-minimizing) agents, in other words, controllers. It can be applied in model-free or model-based fashion, using a priori or online collected system…

Systems and Control · Electrical Eng. & Systems 2022-09-01 Lukas Beckenbach, Pavel Osinenko, Stefan Streif

Tractable Offline Learning of Regular Decision Processes

This work studies offline Reinforcement Learning (RL) in a class of non-Markovian environments called Regular Decision Processes (RDPs). In RDPs, the unknown dependency of future observations and rewards on past interactions can be…

Machine Learning · Computer Science 2024-09-05 Ahana Deb, Roberto Cipollone, Anders Jonsson, Alessandro Ronca, Mohammad Sadegh Talebi

Revisiting Design Choices in Offline Model-Based Reinforcement Learning

Offline reinforcement learning enables agents to leverage large pre-collected datasets of environment transitions to learn control policies, circumventing the need for potentially expensive or unsafe online data collection. Significant…

Machine Learning · Computer Science 2022-03-17 Cong Lu, Philip J. Ball, Jack Parker-Holder, Michael A. Osborne, Stephen J. Roberts

Learning Abstract Models for Strategic Exploration and Fast Reward Transfer

Model-based reinforcement learning (RL) is appealing because (i) it enables planning and thus more strategic exploration, and (ii) by decoupling dynamics from rewards, it enables fast transfer to new reward functions. However, learning an…

Machine Learning · Computer Science 2020-07-14 Evan Zheran Liu, Ramtin Keramati, Sudarshan Seshadri, Kelvin Guu, Panupong Pasupat, Emma Brunskill, Percy Liang

Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning

Interacting with the actual environment to acquire data is often costly and time-consuming in robotic tasks. Model-based offline reinforcement learning (RL) provides a feasible solution. On the one hand, it eliminates the requirements of…

Machine Learning · Computer Science 2023-10-17 Pengqin Wang, Meixin Zhu, Shaojie Shen

Offline Reinforcement Learning with Imputed Rewards

Offline Reinforcement Learning (ORL) offers a robust solution to training agents in applications where interactions with the environment must be strictly limited due to cost, safety, or lack of accurate simulation environments. Despite its…

Machine Learning · Computer Science 2024-07-16 Carlo Romeo, Andrew D. Bagdanov

A Workflow for Offline Model-Free Robotic Reinforcement Learning

Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction. This can allow robots to acquire generalizable skills from large and diverse datasets, without any…

Machine Learning · Computer Science 2021-09-24 Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine

MOPO: Model-based Offline Policy Optimization

Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a large batch of previously collected data. This problem setting offers the promise of utilizing such datasets to acquire policies without any…

Machine Learning · Computer Science 2020-11-24 Tianhe Yu, Garrett Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma

Model-enhanced Contrastive Reinforcement Learning for Sequential Recommendation

Reinforcement learning (RL) has been widely applied in recommendation systems due to its potential in optimizing the long-term engagement of users. From the perspective of RL, recommendation can be formulated as a Markov decision process…

Information Retrieval · Computer Science 2023-10-26 Chengpeng Li, Zhengyi Yang, Jizhi Zhang, Jiancan Wu, Dingxian Wang, Xiangnan He, Xiang Wang

Model-Based Reinforcement Learning Under Confounding

We investigate model-based reinforcement learning in contextual Markov decision processes (C-MDPs) in which the context is unobserved and induces confounding in the offline dataset. In such settings, conventional model-learning methods are…

Machine Learning · Computer Science 2025-12-09 Nishanth Venkatesh, Andreas A. Malikopoulos

Model-Based Reinforcement Learning with Multi-Task Offline Pretraining

Pretraining reinforcement learning (RL) models on offline datasets is a promising way to improve their training efficiency in online tasks, but challenging due to the inherent mismatch in dynamics and behaviors across various tasks. We…

Machine Learning · Computer Science 2024-06-06 Minting Pan, Yitao Zheng, Yunbo Wang, Xiaokang Yang

Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective

Inverse Reinforcement Learning (IRL), the problem of learning reward functions from demonstrations of an expert policy, plays a critical role in developing intelligent systems. While widely used in applications, theoretical…

Machine Learning · Statistics 2024-02-13 Lei Zhao, Mengdi Wang, Yu Bai

Offline Reinforcement Learning from Images with Latent Space Models

Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions. Offline RL enables extensive use and re-use of historical datasets, while also alleviating safety concerns…

Machine Learning · Computer Science 2020-12-22 Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn

Offline Meta-Reinforcement Learning with Online Self-Supervision

Meta-reinforcement learning (RL) methods can meta-train policies that adapt to new tasks with orders of magnitude less data than standard RL, but meta-training itself is costly and time-consuming. If we can meta-train on offline data, then…

Machine Learning · Computer Science 2022-07-08 Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine

Self-Supervised Reinforcement Learning that Transfers using Random Features

Model-free reinforcement learning algorithms have exhibited great potential in solving single-task sequential decision-making problems with high-dimensional observations and long horizons, but are known to be hard to generalize across…

Machine Learning · Computer Science 2023-05-30 Boyuan Chen, Chuning Zhu, Pulkit Agrawal, Kaiqing Zhang, Abhishek Gupta

Experience enrichment based task independent reward model

For most reinforcement learning approaches, learning is performed by maximizing a cumulative reward that is manually defined in advance for specific tasks. However, in the real world, rewards are emergent phenomena from the complex…

Artificial Intelligence · Computer Science 2017-05-23 Min Xu