Related papers: Learning Interactive Real-World Simulators

200 papers

Action-conditioned video prediction models (often referred to as world models) have shown strong potential for robotics applications, but existing approaches are often slow and struggle to capture physically consistent interactions over…

Video generative models pre-trained on large-scale internet datasets have achieved remarkable success, excelling at producing realistic synthetic videos. However, they often generate clips based on static prompts (e.g., text or images),…

Computer Vision and Pattern Recognition · Computer Science 2025-02-13 Haoran He, Yang Zhang, Liang Lin, Zhongwen Xu, Ling Pan

The landscape of video generation is shifting, from a focus on generating visually appealing clips to building virtual environments that support interaction and maintain physical plausibility. These developments point toward the emergence…

Artificial Intelligence · Computer Science 2026-02-09 Jingtong Yue, Ziqi Huang, Zhaoxi Chen, Xintao Wang, Pengfei Wan, Ziwei Liu

Understanding and replicating the real world is a critical challenge in Artificial General Intelligence (AGI) research. To achieve this, many existing approaches, such as world models, aim to capture the fundamental principles governing the…

Computer Vision and Pattern Recognition · Computer Science 2026-02-17 Yuqi Hu, Longguang Wang, Xian Liu, Ling-Hao Chen, Yuwei Guo, Yukai Shi, Ce Liu, Anyi Rao, Zeyu Wang, Hui Xiong

Simulation is a crucial component of any robotic system. In order to simulate correctly, we need to write complex rules of the environment: how dynamic agents behave, and how the actions of each of the agents affect the behavior of others.…

Computer Vision and Pattern Recognition · Computer Science 2020-05-26 Seung Wook Kim, Yuhao Zhou, Jonah Philion, Antonio Torralba, Sanja Fidler

User simulation is an emerging interdisciplinary topic with multiple critical applications in the era of Generative AI. It involves creating an intelligent agent that mimics the actions of a human user interacting with an AI system,…

Artificial Intelligence · Computer Science 2026-04-22 Krisztian Balog, ChengXiang Zhai

A world model is an AI system that simulates how an environment evolves under actions, enabling planning through imagined futures rather than reactive perception. Current world models, however, suffer from visual conflation: the mistaken…

Artificial Intelligence · Computer Science 2026-01-23 Zhikang Chen, Tingting Zhu

People interact with the real world largely through visual signals, which are ubiquitous and provide detailed demonstrations. In this paper, we explore utilizing visual signals as a new interface for models to interact with the…

Computer Vision and Pattern Recognition · Computer Science 2025-03-20 Wentao Zhang, Junliang Guo, Tianyu He, Li Zhao, Linli Xu, Jiang Bian

Generalist robot manipulators need to learn a wide variety of manipulation skills across diverse environments. Current robot training pipelines rely on humans to provide kinesthetic demonstrations or to program simulation environments and…

Robotics · Computer Science 2023-10-30 Pushkal Katara, Zhou Xian, Katerina Fragkiadaki

We explore building generative neural network models of popular reinforcement learning environments. Our world model can be trained quickly in an unsupervised manner to learn a compressed spatial and temporal representation of the…

Machine Learning · Computer Science 2018-05-10 David Ha, Jürgen Schmidhuber

Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative…

Human-Computer Interaction · Computer Science 2023-08-08 Joon Sung Park, Joseph C. O'Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, Michael S. Bernstein

Imitation learning has emerged as a promising approach towards building generalist robots. However, scaling imitation learning for large robot foundation models remains challenging due to its reliance on high-quality expert demonstrations.…

Robotics · Computer Science 2025-05-26 Chuning Zhu, Raymond Yu, Siyuan Feng, Benjamin Burchfiel, Paarth Shah, Abhishek Gupta

Developing and testing user interfaces (UIs) and training AI agents to interact with them are challenging due to the dynamic and diverse nature of real-world mobile environments. Existing methods often rely on cumbersome physical devices or…

Computer Vision and Pattern Recognition · Computer Science 2025-09-29 Jiannan Xiang, Yun Zhu, Lei Shu, Maria Wang, Lijun Yu, Gabriel Barcik, James Lyon, Srinivas Sunkara, Jindong Chen

We present LidarDM, a novel LiDAR generative model capable of producing realistic, layout-aware, physically plausible, and temporally coherent LiDAR videos. LidarDM stands out with two unprecedented capabilities in LiDAR generative…

Computer Vision and Pattern Recognition · Computer Science 2025-12-30 Vlas Zyrianov, Henry Che, Zhijian Liu, Shenlong Wang

Generalization to unseen real-world scenarios for robot manipulation requires exposure to diverse datasets during training. However, collecting large real-world datasets is intractable due to high operational costs. For robot learning to…

Robotics · Computer Science 2024-09-04 Zoey Chen, Zhao Mandi, Homanga Bharadhwaj, Mohit Sharma, Shuran Song, Abhishek Gupta, Vikash Kumar

Simulation has the potential to massively scale evaluation of self-driving systems enabling rapid development as well as safe deployment. To close the gap between simulation and the real world, we need to simulate realistic multi-agent…

Robotics · Computer Science 2021-01-19 Simon Suo, Sebastian Regalado, Sergio Casas, Raquel Urtasun

Training robot policies within a learned world model is trending due to the inefficiency of real-world interactions. The established image-based world models and policies have shown prior success, but lack robust geometric information that…

Robotics · Computer Science 2025-09-18 Guanxing Lu, Baoxiong Jia, Puhao Li, Yixin Chen, Ziwei Wang, Yansong Tang, Siyuan Huang

We introduce Genie, the first generative interactive environment trained in an unsupervised manner from unlabelled Internet videos. The model can be prompted to generate an endless variety of action-controllable virtual worlds described…

Users interact with text, image, code, or other editors on a daily basis. However, machine learning models are rarely trained in the settings that reflect the interactivity between users and their editor. This is understandable as training…

Computation and Language · Computer Science 2023-11-14 Felix Faltings, Michel Galley, Baolin Peng, Kianté Brantley, Weixin Cai, Yizhe Zhang, Jianfeng Gao, Bill Dolan

Collecting large amounts of real-world interaction data to train general robotic policies is often prohibitively expensive, thus motivating the use of simulation data. However, existing methods for data generation have generally focused on…

Machine Learning · Computer Science 2024-01-23 Lirui Wang, Yiyang Ling, Zhecheng Yuan, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang