Related papers: VectorWorld: Efficient Streaming World Model via D…

200 papers

Recent successes in autoregressive (AR) generation models, such as the GPT series in natural language processing, have motivated efforts to replicate this success in visual tasks. Some works attempt to extend this approach to autonomous…

Computer Vision and Pattern Recognition · Computer Science 2024-12-31 Xiaotao Hu , Wei Yin , Mingkai Jia , Junyuan Deng , Xiaoyang Guo , Qian Zhang , Xiaoxiao Long , Ping Tan

Video generation models, as one form of world models, have emerged as one of the most exciting frontiers in AI, promising agents the ability to imagine the future by modeling the temporal evolution of complex scenes. In autonomous driving,…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Yang Zhou , Hao Shao , Letian Wang , Zhuofan Zong , Hongsheng Li , Steven L. Waslander

Action-conditioned video prediction models (often referred to as world models) have shown strong potential for robotics applications, but existing approaches are often slow and struggle to capture physically consistent interactions over…

Scalable and reliable evaluation is increasingly critical in the end-to-end era of autonomous driving, where vision-language-action (VLA) policies directly map raw sensor streams to driving actions. Yet, current evaluation pipelines still…

Computer Vision and Pattern Recognition · Computer Science 2026-04-01 Chaoda Zheng , Sean Li , Jinhao Deng , Zhennan Wang , Shijia Chen , Liqiang Xiao , Ziheng Chi , Hongbin Lin , Kangjie Chen , Boyang Wang , Yu Zhang , Xianming Liu

Generative video models, a leading approach to world modeling, face fundamental limitations. They often violate physical and logical rules, lack interactivity, and operate as opaque black boxes ill-suited for building structured, queryable…

Computer Vision and Pattern Recognition · Computer Science 2025-12-15 Felix O'Mahony , Roberto Cipolla , Ayush Tewari

Generative driving world models rely on compact latent state representations that must be efficiently transmitted and synchronized across distributed compute and connected vehicles. We study network-efficient streaming of a discrete world…

Robotics · Computer Science 2026-05-12 Shatadal Mishra , Ahmadreza Moradipari , Nejib Ammar

Generative world models (WMs) can now simulate worlds with striking visual realism, which naturally raises the question of whether they can endow embodied agents with predictive perception for decision making. Progress on this question has…

Action-conditioned video models offer a promising path to building general-purpose robot simulators that can improve directly from data. Yet, despite training on large-scale robot datasets, current state-of-the-art video models still…

Pretrained foundation models have become an important basis for end-to-end autonomous driving. In contrast to vision-language models pretrained primarily on static image-text pairs, video generative models capture temporal dynamics and…

Computer Vision and Pattern Recognition · Computer Science 2026-05-28 Chen Shi , Jinrui Xu , Shaoshuai Shi , Kehua Sheng , Bo Zhang , Li Jiang

World models, especially in autonomous driving, are trending and drawing extensive attention due to their capacity for comprehending driving environments. The established world model holds immense potential for the generation of…

Computer Vision and Pattern Recognition · Computer Science 2023-11-28 Xiaofeng Wang , Zheng Zhu , Guan Huang , Xinze Chen , Jiagang Zhu , Jiwen Lu

With the rapid advancement of autonomous driving technology, a lack of data has become a major obstacle to enhancing perception model accuracy. Researchers are now exploring controllable data generation using world models to diversify…

Computer Vision and Pattern Recognition · Computer Science 2025-06-27 Xinqing Li , Ruiqi Song , Qingyu Xie , Ye Wu , Nanxin Zeng , Yunfeng Ai

End-to-end autonomous driving with vision-only is not only more cost-effective compared to LiDAR-vision fusion but also more reliable than traditional methods. To achieve an economical and robust purely visual autonomous driving system, we…

Computer Vision and Pattern Recognition · Computer Science 2025-02-14 Ziyang Yan , Wenzhen Dong , Yihua Shao , Yuhang Lu , Liu Haiyang , Jingwen Liu , Haozhe Wang , Zhe Wang , Yan Wang , Fabio Remondino , Yuexin Ma

Recently, world models have been incorporated into autonomous driving systems to improve planning reliability. Existing approaches typically predict future states through appearance generation or deterministic regression, which…

Computer Vision and Pattern Recognition · Computer Science 2026-05-05 Xiaolu Liu , Yicong Li , Song Wang , Junbo Chen , Angela Yao , Jianke Zhu

World models have attracted increasing attention in autonomous driving for their ability to forecast potential future scenarios. In this paper, we propose BEVWorld, a novel framework that transforms multimodal sensor inputs into a unified…

Computer Vision and Pattern Recognition · Computer Science 2025-05-01 Yumeng Zhang , Shi Gong , Kaixin Xiong , Xiaoqing Ye , Xiaofan Li , Xiao Tan , Fan Wang , Jizhou Huang , Hua Wu , Haifeng Wang

World models aim to endow AI systems with the ability to represent, generate, and interact with dynamic environments in a coherent and temporally consistent manner. While recent video generation models have demonstrated impressive visual…

Dynamical systems theory and reinforcement learning view world evolution as latent-state dynamics driven by actions, with visual observations providing partial information about the state. Recent video world models attempt to learn this…

Computer Vision and Pattern Recognition · Computer Science 2026-03-25 Zhen Li , Zian Meng , Shuwei Shi , Wenshuo Peng , Yuwei Wu , Bo Zheng , Chuanhao Li , Kaipeng Zhang

World models have gained significant attention as a promising approach for autonomous driving. By emulating human-like perception and decision-making processes, these models can predict and adapt to dynamic environments. Existing methods…

Robotics · Computer Science 2025-12-03 Huiqian Li , Wei Pan , Haodong Zhang , Jin Huang , Zhihua Zhong

Autonomous driving requires reasoning about how the environment evolves and planning actions accordingly. Existing world-model-based approaches typically predict future scenes first and plan afterwards, resulting in open-loop imagination…

Robotics · Computer Science 2026-03-31 Qiqi Liu , Huan Xu , Jingyu Li , Bin Sun , Zhihui Hao , Dangen She , Xiatian Zhu , Li Zhang

Multi-agent traffic simulation is central to developing and testing autonomous driving systems. Recent data-driven simulators have achieved promising results, but rely heavily on supervised learning from labeled trajectories or semantic…

Robotics · Computer Science 2026-04-01 Mozhgan Pourkeshavatz , Tianran Liu , Nicholas Rhinehart

Web agents require massive trajectories to generalize, yet real-world training is constrained by network latency, rate limits, and safety risks. We introduce the WebWorld series, the first open-web simulator trained at scale. While…

Artificial Intelligence · Computer Science 2026-02-17 Zikai Xiao , Jianhong Tu , Chuhang Zou , Yuxin Zuo , Zhi Li , Peng Wang , Bowen Yu , Fei Huang , Junyang Lin , Zuozhu Liu