Related papers: Semantic World Models

From Generative Engines to Actionable Simulators: The Imperative of Physical Grounding in World Models

A world model is an AI system that simulates how an environment evolves under actions, enabling planning through imagined futures rather than reactive perception. Current world models, however, suffer from visual conflation: the mistaken…

Artificial Intelligence · Computer Science 2026-01-23 Zhikang Chen, Tingting Zhu

A Step Toward World Models: A Survey on Robotic Manipulation

Autonomous agents are increasingly expected to operate in complex, dynamic, and uncertain environments, performing tasks such as manipulation, navigation, and decision-making. Achieving these capabilities requires agents to understand the…

Robotics · Computer Science 2025-11-11 Peng-Fei Zhang, Ying Cheng, Xiaofan Sun, Shijie Wang, Fengling Li, Lei Zhu, Heng Tao Shen

Reconstruction or Semantics? What Makes a Latent Space Useful for Robotic World Models

World model-based policy evaluation is a practical proxy for testing real-world robot control by rolling out candidate actions in action-conditioned video diffusion models. As these models increasingly adopt latent diffusion modeling (LDM),…

Computer Vision and Pattern Recognition · Computer Science 2026-05-08 Nilaksh, Saurav Jha, Artem Zholus, Sarath Chandar

From Pixels to Predicates: Learning Symbolic World Models via Pretrained Vision-Language Models

Our aim is to learn to solve long-horizon decision-making problems in complex robotics domains given low-level skills and a handful of short-horizon demonstrations containing sequences of images. To this end, we focus on learning abstract…

Robotics · Computer Science 2026-03-10 Ashay Athalye, Nishanth Kumar, Tom Silver, Yichao Liang, Jiuguang Wang, Tomás Lozano-Pérez, Leslie Pack Kaelbling

Understanding World or Predicting Future? A Comprehensive Survey of World Models

The concept of world models has garnered significant attention due to advancements in multimodal large language models such as GPT-4 and video generation models such as Sora, which are central to the pursuit of artificial general…

Computation and Language · Computer Science 2025-12-11 Jingtao Ding, Yunke Zhang, Yu Shang, Jie Feng, Yuheng Zhang, Zefang Zong, Yuan Yuan, Hongyuan Su, Nian Li, Jinghua Piao, Yucheng Deng, Nicholas Sukiennik, Chen Gao, Fengli Xu, Yong Li

Causal World Models by Unsupervised Deconfounding of Physical Dynamics

The capability of imagining internally with a mental model of the world is vitally important for human cognition. If a machine intelligent agent can learn a world model to create a "dream" environment, it can then internally ask what-if…

Machine Learning · Computer Science 2020-12-29 Minne Li, Mengyue Yang, Furui Liu, Xu Chen, Zhitang Chen, Jun Wang

H-WM: Robotic Task and Motion Planning Guided by Hierarchical World Model

World models are becoming central to robotic planning and control as they enable prediction of future state transitions. Existing approaches often emphasize video generation or natural-language prediction, which are difficult to ground in…

Robotics · Computer Science 2026-03-05 Jinbang Huang, Wenyuan Chen, Zhiyuan Li, Oscar Pang, Xiao Hu, Lingfeng Zhang, Yuanzhao Hu, Zhanguang Zhang, Mark Coates, Tongtong Cao, Xingyue Quan, Yingxue Zhang

Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics

Learning robust and generalizable world models is crucial for enabling efficient and scalable robotic control in real-world environments. In this work, we introduce a novel framework for learning world models that accurately capture…

Robotics · Computer Science 2025-12-16 Chenhao Li, Andreas Krause, Marco Hutter

Object-Centric World Model for Language-Guided Manipulation

A world model is essential for an agent to predict the future and plan in domains such as autonomous driving and robotics. To achieve this, recent advancements have focused on video generation, which has gained significant attention due to…

Artificial Intelligence · Computer Science 2025-03-13 Youngjoon Jeong, Junha Chun, Soonwoo Cha, Taesup Kim

Interactive World Simulator for Robot Policy Training and Evaluation

Action-conditioned video prediction models (often referred to as world models) have shown strong potential for robotics applications, but existing approaches are often slow and struggle to capture physically consistent interactions over…

Robotics · Computer Science 2026-03-10 Yixuan Wang, Rhythm Syed, Fangyu Wu, Mengchao Zhang, Aykut Onol, Jose Barreiros, Hooshang Nayyeri, Tony Dear, Huan Zhang, Yunzhu Li

World Model for Robot Learning: A Comprehensive Survey

World models, which are predictive representations of how environments evolve under actions, have become a central component of robot learning. They support policy learning, planning, simulation, evaluation, data generation, and have…

Robotics · Computer Science 2026-05-04 Bohan Hou, Gen Li, Jindou Jia, Tuo An, Xinying Guo, Sicong Leng, Haoran Geng, Yanjie Ze, Tatsuya Harada, Philip Torr, Oier Mees, Marc Pollefeys, Zhuang Liu, Jiajun Wu, Pieter Abbeel, Jitendra Malik, Yilun Du, Jianfei Yang

GWM: Towards Scalable Gaussian World Models for Robotic Manipulation

Training robot policies within a learned world model is trending due to the inefficiency of real-world interactions. The established image-based world models and policies have shown prior success, but lack robust geometric information that…

Robotics · Computer Science 2025-09-18 Guanxing Lu, Baoxiong Jia, Puhao Li, Yixin Chen, Ziwei Wang, Yansong Tang, Siyuan Huang

World Models for Cognitive Agents: Transforming Edge Intelligence in Future Networks

World models are emerging as a transformative paradigm in artificial intelligence, enabling agents to construct internal representations of their environments for predictive reasoning, planning, and decision-making. By learning latent…

Artificial Intelligence · Computer Science 2025-06-03 Changyuan Zhao, Ruichen Zhang, Jiacheng Wang, Gaosheng Zhao, Dusit Niyato, Geng Sun, Shiwen Mao, Dong In Kim

Neural World Models for Computer Vision

Humans navigate in their environment by learning a mental model of the world through passive observation and active interaction. Their world model allows them to anticipate what might happen next and act accordingly with respect to an…

Computer Vision and Pattern Recognition · Computer Science 2023-06-16 Anthony Hu

Semantic Task Planning for Service Robots in Open World

In this paper, we present a planning system based on semantic reasoning for a general-purpose service robot, which is aimed at behaving more intelligently in domains that contain incomplete information, under-specified goals, and dynamic…

Robotics · Computer Science 2020-11-03 Guowei Cui, Wei Shuai, Xiaoping Chen

A Picture is Worth a Thousand Words: Language Models Plan from Pixels

Planning is an important capability of artificial agents that perform long-horizon tasks in real-world environments. In this work, we explore the use of pre-trained language models (PLMs) to reason about plan sequences from text…

Computation and Language · Computer Science 2023-03-17 Anthony Z. Liu, Lajanugen Logeswaran, Sungryull Sohn, Honglak Lee

Improving Information Extraction from Images with Learned Semantic Models

Many applications require an understanding of an image that goes beyond the simple detection and classification of its objects. In particular, a great deal of semantic information is carried in the relationships between objects. We have…

Artificial Intelligence · Computer Science 2018-08-28 Stephan Baier, Yunpu Ma, Volker Tresp

Planning with Reasoning using Vision Language World Model

Effective planning requires strong world models, but high-level world models that can understand and reason about actions with semantic and temporal abstraction remain largely underdeveloped. We introduce the Vision Language World Model…

Artificial Intelligence · Computer Science 2025-09-09 Delong Chen, Theo Moutakanni, Willy Chung, Yejin Bang, Ziwei Ji, Allen Bolourchi, Pascale Fung

Physically Native World Models: A Hamiltonian Perspective on Generative World Modeling

World models have recently re-emerged as a central paradigm for embodied intelligence, robotics, autonomous driving, and model-based reinforcement learning. However, current world model research is often dominated by three partially…

Artificial Intelligence · Computer Science 2026-05-27 Sen Cui, Jingheng Ma

Foundational Models Defining a New Era in Vision: A Survey and Outlook

Vision systems to see and reason about the compositional nature of visual scenes are fundamental to understanding our world. The complex relations between objects and their locations, ambiguities, and variations in the real-world…

Computer Vision and Pattern Recognition · Computer Science 2023-07-27 Muhammad Awais, Muzammal Naseer, Salman Khan, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Fahad Shahbaz Khan