Related papers: Four Principles for Physically Interpretable World…

Physically Interpretable World Models via Weakly Supervised Representation Learning

Learning predictive models from high-dimensional sensory observations is fundamental for cyber-physical systems, yet the latent representations learned by standard world models lack physical interpretability. This limits their reliability,…

Machine Learning · Computer Science 2026-04-07 Zhenjiang Mao, Mrinall Eashaan Umasudhan, Ivan Ruchkin

Physically Native World Models: A Hamiltonian Perspective on Generative World Modeling

World models have recently re-emerged as a central paradigm for embodied intelligence, robotics, autonomous driving, and model-based reinforcement learning. However, current world model research is often dominated by three partially…

Artificial Intelligence · Computer Science 2026-05-27 Sen Cui, Jingheng Ma

Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments

The next generation of autonomous agents must not only learn efficiently but also act reliably and adapt their behavior in open worlds. Standard approaches typically assume fixed tasks and environments with little or no novelty, which…

Machine Learning · Computer Science 2026-03-02 Florent Delgrange

World Models Should Prioritize the Unification of Physical and Social Dynamics

World models, which explicitly learn environmental dynamics to lay the foundation for planning, reasoning, and decision-making, are rapidly advancing in predicting both physical dynamics and aspects of social behavior, yet predominantly in…

Computers and Society · Computer Science 2025-10-27 Xiaoyuan Zhang, Chengdong Ma, Yizhe Huang, Weidong Huang, Siyuan Qi, Song-Chun Zhu, Xue Feng, Yaodong Yang

World Model-based Perception for Visual Legged Locomotion

Legged locomotion over various terrains is challenging and requires precise perception of the robot and its surroundings from both proprioception and vision. However, learning directly from high-dimensional visual input is often…

Robotics · Computer Science 2024-09-26 Hang Lai, Jiahang Cao, Jiafeng Xu, Hongtao Wu, Yunfeng Lin, Tao Kong, Yong Yu, Weinan Zhang

A Step Toward World Models: A Survey on Robotic Manipulation

Autonomous agents are increasingly expected to operate in complex, dynamic, and uncertain environments, performing tasks such as manipulation, navigation, and decision-making. Achieving these capabilities requires agents to understand the…

Robotics · Computer Science 2025-11-11 Peng-Fei Zhang, Ying Cheng, Xiaofan Sun, Shijie Wang, Fengling Li, Lei Zhu, Heng Tao Shen

Learning Interpretable Concept-Based Models with Human Feedback

Machine learning models that first learn a representation of a domain in terms of human-understandable concepts, then use it to make predictions, have been proposed to facilitate interpretation and interaction with models trained on…

Machine Learning · Computer Science 2020-12-08 Isaac Lage, Finale Doshi-Velez

Causal World Models by Unsupervised Deconfounding of Physical Dynamics

The capability of imagining internally with a mental model of the world is vitally important for human cognition. If a machine intelligent agent can learn a world model to create a "dream" environment, it can then internally ask what-if…

Machine Learning · Computer Science 2020-12-29 Minne Li, Mengyue Yang, Furui Liu, Xu Chen, Zhitang Chen, Jun Wang

WoW: Towards a World omniscient World model Through Embodied Interaction

Humans develop an understanding of intuitive physics through active interaction with the world. This approach is in stark contrast to current video models, such as Sora, which rely on passive observation and therefore struggle with grasping…

Robotics · Computer Science 2025-10-17 Xiaowei Chi, Peidong Jia, Chun-Kai Fan, Xiaozhu Ju, Weishi Mi, Kevin Zhang, Zhiyuan Qin, Wanxin Tian, Kuangzhi Ge, Hao Li, Zezhong Qian, Anthony Chen, Qiang Zhou, Yueru Jia, Jiaming Liu, Yong Dai, Qingpo Wuwu, Chengyu Bai, Yu-Kai Wang, Ying Li, Lizhang Chen, Yong Bao, Zhiyuan Jiang, Jiacheng Zhu, Kai Tang, Ruichuan An, Yulin Luo, Qiuxuan Feng, Siyuan Zhou, Chi-min Chan, Chengkai Hou, Wei Xue, Sirui Han, Yike Guo, Shanghang Zhang, Jian Tang

Interpretable machine learning models: a physics-based view

To understand changes in physical systems and facilitate decisions, explaining how model predictions are made is crucial. We use model-based interpretability, where models of physical systems are constructed by composing basic constructs…

Artificial Intelligence · Computer Science 2020-03-24 Ion Matei, Johan de Kleer, Christoforos Somarakis, Rahul Rai, John S. Baras

Extracting Interpretable Physical Parameters from Spatiotemporal Systems using Unsupervised Learning

Experimental data is often affected by uncontrolled variables that make analysis and interpretation difficult. For spatiotemporal systems, this problem is further exacerbated by their intricate dynamics. Modern machine learning methods are…

Computational Physics · Physics 2020-09-16 Peter Y. Lu, Samuel Kim, Marin Soljačić

Interpretable Latent Spaces for Learning from Demonstration

Effective human-robot interaction, such as in robot learning from human demonstration, requires the learning agent to be able to ground abstract concepts (such as those contained within instructions) in a corresponding high-dimensional…

Computer Vision and Pattern Recognition · Computer Science 2018-10-03 Yordan Hristov, Alex Lascarides, Subramanian Ramamoorthy

From Generative Engines to Actionable Simulators: The Imperative of Physical Grounding in World Models

A world model is an AI system that simulates how an environment evolves under actions, enabling planning through imagined futures rather than reactive perception. Current world models, however, suffer from visual conflation: the mistaken…

Artificial Intelligence · Computer Science 2026-01-23 Zhikang Chen, Tingting Zhu

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

World models have emerged as a critical frontier in AI research, aiming to enhance large models by infusing them with physical dynamics and world knowledge. The core objective is to enable agents to understand, predict, and interact with…

Computer Vision and Pattern Recognition · Computer Science 2026-02-03 Bohan Zeng, Kaixin Zhu, Daili Hua, Bozhou Li, Chengzhuo Tong, Yuran Wang, Xinyi Huang, Yifan Dai, Zixiang Zhang, Yifan Yang, Zhou Liu, Hao Liang, Xiaochen Ma, Ruichuan An, Tianyi Bai, Hongcheng Gao, Junbo Niu, Yang Shi, Xinlong Chen, Yue Ding, Minglei Shi, Kai Zeng, Yiwen Tang, Yuanxing Zhang, Pengfei Wan, Xintao Wang, Wentao Zhang

Emergent Semantic Representations in World Models through Physical Interaction without Linguistic Supervision

What does a world model learn from physical exploration, without any linguistic supervision? We argue the answer is organized by a single principle: the geometric structure of the physical world. Training a VAE-based world model on random…

Machine Learning · Computer Science 2026-05-29 Jiayi Fang

Learning Actionable World Models for Industrial Process Control

To go from (passive) process monitoring to active process control, an effective AI system must learn about the behavior of the complex system from very limited training data, forming an ad-hoc digital twin with respect to process inputs and…

Machine Learning · Computer Science 2025-04-28 Peng Yan, Ahmed Abdulkadir, Gerrit A. Schatte, Giulia Aguzzi, Joonsu Gha, Nikola Pascher, Matthias Rosenthal, Yunlong Gao, Benjamin F. Grewe, Thilo Stadelmann

Back to Parsimonious Latents: Learning Task-Centric World Models from Visual Foundations

World models enable agents to predict future dynamics conditioned on actions, making the choice of latent representation central to planning and control. Such representations are often either learned directly from pixels with limited…

Artificial Intelligence · Computer Science 2026-05-26 Minghao Fu, Fan Feng, Nicklas Hansen, Biwei Huang

Aligning Agentic World Models via Knowledgeable Experience Learning

Current Large Language Models (LLMs) exhibit a critical modal disconnect: they possess vast semantic knowledge but lack the procedural grounding to respect the immutable laws of the physical world. Consequently, while these agents…

Computation and Language · Computer Science 2026-01-21 Baochang Ren, Yunzhi Yao, Rui Sun, Shuofei Qiao, Ningyu Zhang, Huajun Chen

Where and What? Examining Interpretable Disentangled Representations

Capturing interpretable variations has long been one of the goals in disentanglement learning. However, unlike the independence assumption, interpretability has rarely been exploited to encourage disentanglement in the unsupervised setting.…

Computer Vision and Pattern Recognition · Computer Science 2021-04-13 Xinqi Zhu, Chang Xu, Dacheng Tao

World Models in Artificial Intelligence: Sensing, Learning, and Reasoning Like a Child

World Models help Artificial Intelligence (AI) predict outcomes, reason about its environment, and guide decision-making. While widely used in reinforcement learning, they lack the structured, adaptive representations that even young…

Artificial Intelligence · Computer Science 2025-03-20 Javier Del Ser, Jesus L. Lobo, Heimo Müller, Andreas Holzinger