Related papers: Improving Transformer World Models for Data-Effici…

Transformer-based World Models Are Happy With 100k Interactions

Deep neural networks have been successful in many reinforcement learning settings. However, compared to human learners they are overly data hungry. To build a sample-efficient world model, we apply a transformer to real-world episodes in an…

Machine Learning · Computer Science · 2023-03-14 · Jan Robine, Marc Höftmann, Tobias Uelwer, Stefan Harmeling

Accelerating Transformers in Online RL

The appearance of transformer-based models in Reinforcement Learning (RL) has expanded the horizons of possibilities in robotics tasks, but it has simultaneously brought a wide range of challenges during its implementation, especially in…

Machine Learning · Computer Science · 2025-10-01 · Daniil Zelezetsky, Alexey K. Kovalev, Aleksandr I. Panov

Finetuning Offline World Models in the Real World

Reinforcement Learning (RL) is notoriously data-inefficient, which makes training on a real robot difficult. While model-based RL algorithms (world models) improve data-efficiency to some extent, they still require hours or days of…

Machine Learning · Computer Science · 2023-10-25 · Yunhai Feng, Nicklas Hansen, Ziyan Xiong, Chandramouli Rajagopalan, Xiaolong Wang

Learning Transformer-based World Models with Contrastive Predictive Coding

The DreamerV3 algorithm recently obtained remarkable performance across diverse environment domains by learning an accurate world model based on Recurrent Neural Networks (RNNs). Following the success of model-based reinforcement learning…

Machine Learning · Computer Science · 2025-05-27 · Maxime Burchi, Radu Timofte

TransDreamer: Reinforcement Learning with Transformer World Models

The Dreamer agent provides various benefits of Model-Based Reinforcement Learning (MBRL) such as sample efficiency, reusable knowledge, and safe planning. However, its world model and policy networks inherit the limitations of recurrent…

Machine Learning · Computer Science · 2024-11-20 · Chang Chen, Yi-Fu Wu, Jaesik Yoon, Sungjin Ahn

Dream to Adapt: Meta Reinforcement Learning by Latent Context Imagination and MDP Imagination

Meta reinforcement learning (Meta RL) has been amply explored to quickly learn an unseen task by transferring previously learned knowledge from similar tasks. However, most state-of-the-art algorithms require the meta-training tasks to have…

Machine Learning · Computer Science · 2023-11-14 · Lu Wen, Songan Zhang, H. Eric Tseng, Huei Peng

PWM: Policy Learning with Multi-Task World Models

Reinforcement Learning (RL) has made significant strides in complex tasks but struggles in multi-task settings with different embodiments. World model methods offer scalability by learning a simulation of the environment but often rely on…

Machine Learning · Computer Science · 2025-02-25 · Ignat Georgiev, Varun Giridhar, Nicklas Hansen, Animesh Garg

Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds

The strong performance of large vision-language models (VLMs) trained with reinforcement learning (RL) has motivated similar approaches for fine-tuning vision-language-action (VLA) models in robotics. Many recent works fine-tune VLAs…

Robotics · Computer Science · 2026-03-31 · Andrew Choi, Xinjie Wang, Zhizhong Su, Wei Xu

DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning

We introduce DynaMITE-RL, a meta-reinforcement learning (meta-RL) approach to approximate inference in environments where the latent state evolves at varying rates. We model episode sessions - parts of the episode where the latent state is…

Machine Learning · Computer Science · 2024-12-05 · Anthony Liang, Guy Tennenholtz, Chih-wei Hsu, Yinlam Chow, Erdem Bıyık, Craig Boutilier

AutoTrans: Automating Transformer Design via Reinforced Architecture Search

Though the transformer architectures have shown dominance in many natural language understanding tasks, there are still unsolved issues for the training of transformer models, especially the need for a principled way of warm-up which has…

Computation and Language · Computer Science · 2021-06-01 · Wei Zhu, Xiaoling Wang, Xipeng Qiu, Yuan Ni, Guotong Xie

VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model

The goal of this paper is to improve the performance and reliability of vision-language-action (VLA) models through iterative online interaction. Since collecting policy rollouts in the real world is expensive, we investigate whether a…

Robotics · Computer Science · 2026-02-17 · Yanjiang Guo, Tony Lee, Lucy Xiaoyang Shi, Jianyu Chen, Percy Liang, Chelsea Finn

DyMoDreamer: World Modeling with Dynamic Modulation

A critical bottleneck in deep reinforcement learning (DRL) is sample inefficiency, as training high-performance agents often demands extensive environmental interactions. Model-based reinforcement learning (MBRL) mitigates this by building…

Machine Learning · Computer Science · 2025-09-30 · Boxuan Zhang, Runqing Wang, Wei Xiao, Weipu Zhang, Jian Sun, Gao Huang, Jie Chen, Gang Wang

Learning Goal-Conditioned Representations for Language Reward Models

Techniques that learn improved representations via offline data or self-supervised objectives have shown impressive results in traditional reinforcement learning (RL). Nevertheless, it is unclear how improved representation learning can…

Computation and Language · Computer Science · 2024-10-25 · Vaskar Nath, Dylan Slack, Jeff Da, Yuntao Ma, Hugh Zhang, Spencer Whitehead, Sean Hendryx

DAWM: Diffusion Action World Models for Offline Reinforcement Learning via Action-Inferred Transitions

Diffusion-based world models have demonstrated strong capabilities in synthesizing realistic long-horizon trajectories for offline reinforcement learning (RL). However, many existing methods do not directly generate actions alongside states…

Machine Learning · Computer Science · 2026-05-14 · Zongyue Li, Xiao Han, Yusong Li, Niklas Strauss, Matthias Schubert

Efficient World Models with Context-Aware Tokenization

Scaling up deep Reinforcement Learning (RL) methods presents a significant challenge. Following developments in generative modelling, model-based RL positions itself as a strong contender. Recent advances in sequence modelling have led to…

Machine Learning · Computer Science · 2024-06-28 · Vincent Micheli, Eloi Alonso, François Fleuret

Accelerating Model-Based Reinforcement Learning with State-Space World Models

Reinforcement learning (RL) is a powerful approach for robot learning. However, model-free RL (MFRL) requires a large number of environment interactions to learn successful control policies. This is due to the noisy RL training updates and…

Robotics · Computer Science · 2025-02-28 · Maria Krinner, Elie Aljalbout, Angel Romero, Davide Scaramuzza

DayDreamer: World Models for Physical Robot Learning

To solve tasks in complex environments, robots need to learn from experience. Deep reinforcement learning is a common approach to robot learning but requires a large amount of trial and error to learn, limiting its deployment in the…

Robotics · Computer Science · 2022-06-29 · Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, Pieter Abbeel

The Effectiveness of World Models for Continual Reinforcement Learning

World models power some of the most efficient reinforcement learning algorithms. In this work, we showcase that they can be harnessed for continual learning - a situation when the agent faces changing environments. World models typically…

Machine Learning · Computer Science · 2023-07-14 · Samuel Kessler, Mateusz Ostaszewski, Michał Bortkiewicz, Mateusz Żarski, Maciej Wołczyk, Jack Parker-Holder, Stephen J. Roberts, Piotr Miłoś

Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning

Interacting with the actual environment to acquire data is often costly and time-consuming in robotic tasks. Model-based offline reinforcement learning (RL) provides a feasible solution. On the one hand, it eliminates the requirements of…

Machine Learning · Computer Science · 2023-10-17 · Pengqin Wang, Meixin Zhu, Shaojie Shen

Learning to Play Atari in a World of Tokens

Model-based reinforcement learning agents utilizing transformers have shown improved sample efficiency due to their ability to model extended context, resulting in more accurate world models. However, for complex reasoning and planning…

Machine Learning · Computer Science · 2024-06-04 · Pranav Agarwal, Sheldon Andrews, Samira Ebrahimi Kahou