English

Physically Interpretable World Models via Weakly Supervised Representation Learning

Machine Learning 2026-04-07 v6

Abstract

Learning predictive models from high-dimensional sensory observations is fundamental for cyber-physical systems, yet the latent representations learned by standard world models lack physical interpretability. This limits their reliability, generalizability, and applicability to safety-critical tasks. We introduce Physically Interpretable World Models (PIWM), a framework that aligns latent representations with real-world physical quantities and constrains their evolution through partially known physical dynamics. Physical interpretability in PIWM is defined by two complementary properties: (i) the learned latent state corresponds to meaningful physical variables, and (ii) its temporal evolution follows physically consistent dynamics. To achieve this without requiring ground-truth physical annotations, PIWM employs weak distribution-based supervision that captures state uncertainty naturally arising from real-world sensing pipelines. The architecture integrates a VQ-based visual encoder, a transformer-based physical encoder, and a learnable dynamics model grounded in known physical equations. Across three case studies (Cart Pole, Lunar Lander, and Donkey Car), PIWM achieves accurate long-horizon prediction, recovers true system parameters, and significantly improves physical grounding over purely data-driven models. These results demonstrate the feasibility and advantages of learning physically interpretable world models directly from images under weak supervision.

Keywords

Cite

@article{arxiv.2412.12870,
  title  = {Physically Interpretable World Models via Weakly Supervised Representation Learning},
  author = {Zhenjiang Mao and Mrinall Eashaan Umasudhan and Ivan Ruchkin},
  journal= {arXiv preprint arXiv:2412.12870},
  year   = {2026}
}