Related papers: Genie: Generative Interactive Environments

Learning Generative Interactive Environments By Trained Agent Exploration

World models are increasingly pivotal in interpreting and simulating the rules and actions of complex environments. Genie, a recent model, excels at learning from visually diverse environments but relies on costly human-collected data. We…

Computer Vision and Pattern Recognition · Computer Science 2024-10-21 Naser Kazemi, Nedko Savov, Danda Paudel, Luc Van Gool

Exploration-Driven Generative Interactive Environments

Modern world models require costly and time-consuming collection of large video datasets with action demonstrations by people or by environment-specific agents. To simplify training, we focus on using many virtual environments for…

Computer Vision and Pattern Recognition · Computer Science 2025-04-04 Nedko Savov, Naser Kazemi, Mohammad Mahdi, Danda Pani Paudel, Xi Wang, Luc Van Gool

Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation

We introduce Genie Envisioner (GE), a unified world foundation platform for robotic manipulation that integrates policy learning, evaluation, and simulation within a single video-generative framework. At its core, GE-Base is a large-scale,…

Robotics · Computer Science 2025-11-05 Yue Liao, Pengfei Zhou, Siyuan Huang, Donglin Yang, Shengcong Chen, Yuxin Jiang, Yue Hu, Jingbin Cai, Si Liu, Jianlan Luo, Liliang Chen, Shuicheng Yan, Maoqing Yao, Guanghui Ren

Learning Interactive Real-World Simulators

Generative models trained on internet data have revolutionized how text, image, and video content can be created. Perhaps the next milestone for generative models is to simulate realistic experience in response to actions taken by humans,…

Artificial Intelligence · Computer Science 2024-09-27 Sherry Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans, Pieter Abbeel

Surgical Vision World Model

Realistic and interactive surgical simulation has the potential to facilitate crucial applications, such as medical professional training and autonomous surgical agent training. In the natural visual domain, world models have enabled…

Image and Video Processing · Electrical Eng. & Systems 2025-12-16 Saurabh Koju, Saurav Bastola, Prashant Shrestha, Sanskar Amgain, Yash Raj Shrestha, Rudra P. K. Poudel, Binod Bhattarai

Animate Any Character in Any World

Recent advances in world models have greatly enhanced interactive environment simulation. Existing methods mainly fall into two categories: (1) static world generation models, which construct 3D environments without active agents, and (2)…

Computer Vision and Pattern Recognition · Computer Science 2025-12-22 Yitong Wang, Fangyun Wei, Hongyang Zhang, Bo Dai, Yan Lu

GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations

Generative latent-variable models are emerging as promising tools in robotics and reinforcement learning. Yet, even though tasks in these domains typically involve distinct objects, most state-of-the-art generative models do not explicitly…

Machine Learning · Computer Science 2020-11-24 Martin Engelcke, Adam R. Kosiorek, Oiwi Parker Jones, Ingmar Posner

Playable Environments: Video Manipulation in Space and Time

We present Playable Environments - a new representation for interactive video generation and manipulation in space and time. With a single image at inference time, our novel framework allows the user to move objects in 3D while generating a…

Computer Vision and Pattern Recognition · Computer Science 2022-03-17 Willi Menapace, Stéphane Lathuilière, Aliaksandr Siarohin, Christian Theobalt, Sergey Tulyakov, Vladislav Golyanik, Elisa Ricci

Generating Videos with Scene Dynamics

We capitalize on large amounts of unlabeled video in order to learn a model of scene dynamics for both video recognition tasks (e.g. action classification) and video generation tasks (e.g. future prediction). We propose a generative…

Computer Vision and Pattern Recognition · Computer Science 2016-10-27 Carl Vondrick, Hamed Pirsiavash, Antonio Torralba

World Models

We explore building generative neural network models of popular reinforcement learning environments. Our world model can be trained quickly in an unsupervised manner to learn a compressed spatial and temporal representation of the…

Machine Learning · Computer Science 2018-05-10 David Ha, Jürgen Schmidhuber

GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving

Generative models offer a scalable and flexible paradigm for simulating complex environments, yet current approaches fall short in addressing the domain-specific requirements of autonomous driving - such as multi-agent interactions,…

Computer Vision and Pattern Recognition · Computer Science 2025-03-27 Lloyd Russell, Anthony Hu, Lorenzo Bertoni, George Fedoseev, Jamie Shotton, Elahe Arani, Gianluca Corrado

GUI-GENESIS: Automated Synthesis of Efficient Environments with Verifiable Rewards for GUI Agent Post-Training

Post-training GUI agents in interactive environments is critical for developing generalization and long-horizon planning capabilities. However, training on real-world applications is hindered by high latency, poor reproducibility, and…

Artificial Intelligence · Computer Science 2026-02-17 Yuan Cao, Dezhi Ran, Mengzhou Wu, Yuzhe Guo, Xin Chen, Ang Li, Gang Cao, Gong Zhi, Hao Yu, Linyi Li, Wei Yang, Tao Xie

Generative Temporal Models with Spatial Memory for Partially Observed Environments

In model-based reinforcement learning, generative and temporal models of environments can be leveraged to boost agent performance, either by tuning the agent's representations during training or via use as part of an explicit planning…

Machine Learning · Statistics 2018-07-20 Marco Fraccaro, Danilo Jimenez Rezende, Yori Zwols, Alexander Pritzel, S. M. Ali Eslami, Fabio Viola

Genie: Achieving Human Parity in Content-Grounded Datasets Generation

The lack of high-quality data for content-grounded generation tasks has been identified as a major obstacle to advancing these tasks. To address this gap, we propose Genie, a novel method for automatically generating high-quality…

Computation and Language · Computer Science 2024-01-26 Asaf Yehudai, Boaz Carmeli, Yosi Mass, Ofir Arviv, Nathaniel Mills, Assaf Toledo, Eyal Shnarch, Leshem Choshen

Yume: An Interactive World Generation Model

Yume aims to use images, text, or videos to create an interactive, realistic, and dynamic world, which allows exploration and control using peripheral devices or neural signals. In this report, we present a preview version of Yume, which…

Computer Vision and Pattern Recognition · Computer Science 2025-07-24 Xiaofeng Mao, Shaoheng Lin, Zhen Li, Chuanhao Li, Wenshuo Peng, Tong He, Jiangmiao Pang, Mingmin Chi, Yu Qiao, Kaipeng Zhang

Simulating the Visual World with Artificial Intelligence: A Roadmap

The landscape of video generation is shifting, from a focus on generating visually appealing clips to building virtual environments that support interaction and maintain physical plausibility. These developments point toward the emergence…

Artificial Intelligence · Computer Science 2026-02-09 Jingtong Yue, Ziqi Huang, Zhaoxi Chen, Xintao Wang, Pengfei Wan, Ziwei Liu

Recurrent World Models Facilitate Policy Evolution

A generative recurrent neural network is quickly trained in an unsupervised manner to model popular reinforcement learning environments through compressed spatio-temporal representations. The world model's extracted features are fed into…

Machine Learning · Computer Science 2018-09-07 David Ha, Jürgen Schmidhuber

Video Generation From Text

Generating videos from text has proven to be a significant challenge for existing generative models. We tackle this problem by training a conditional generative model to extract both static and dynamic information from text. This is…

Multimedia · Computer Science 2017-10-03 Yitong Li, Martin Renqiang Min, Dinghan Shen, David Carlson, Lawrence Carin

Position: Interactive Generative Video as Next-Generation Game Engine

Modern game development faces significant challenges in creativity and cost due to predetermined content in traditional game engines. Recent breakthroughs in video generation models, capable of synthesizing realistic and interactive virtual…

Computer Vision and Pattern Recognition · Computer Science 2025-05-30 Jiwen Yu, Yiran Qin, Haoxuan Che, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu

GenEx: Generating an Explorable World

Understanding, navigating, and exploring the 3D physical real world has long been a central challenge in the development of artificial intelligence. In this work, we take a step toward this goal by introducing GenEx, a system capable of…

Computer Vision and Pattern Recognition · Computer Science 2025-01-22 Taiming Lu, Tianmin Shu, Junfei Xiao, Luoxin Ye, Jiahao Wang, Cheng Peng, Chen Wei, Daniel Khashabi, Rama Chellappa, Alan Yuille, Jieneng Chen