Related papers: Exploration-Driven Generative Interactive Environm…

200 papers

World models are increasingly pivotal in interpreting and simulating the rules and actions of complex environments. Genie, a recent model, excels at learning from visually diverse environments but relies on costly human-collected data. We…

Computer Vision and Pattern Recognition · Computer Science 2024-10-21 Naser Kazemi, Nedko Savov, Danda Paudel, Luc Van Gool

We introduce Genie, the first generative interactive environment trained in an unsupervised manner from unlabelled Internet videos. The model can be prompted to generate an endless variety of action-controllable virtual worlds described…

Autonomous artificial agents must be able to learn behaviors in complex environments without humans to design tasks and rewards. Designing these functions for each environment is not feasible, thus motivating the development of intrinsic…

Machine Learning · Computer Science 2025-02-20 Alana Santana, Paula P. Costa, Esther L. Colombini

Developing autonomous agents that quickly explore an environment and adapt their behavior online is a canonical challenge in robotics and machine learning. While humans are able to achieve such fast online exploration and adaptation, often…

Machine Learning · Computer Science 2025-07-15 Andrew Wagenmaker, Zhiyuan Zhou, Sergey Levine

Mobile GUI agents show promise in automating tasks but face generalization challenges in diverse real-world scenarios. Traditional approaches using pre-training or fine-tuning with massive datasets struggle with the diversity of mobile…

Human-Computer Interaction · Computer Science 2025-04-21 Guangyi Liu, Pengxiang Zhao, Liang Liu, Zhiming Chen, Yuxiang Chai, Shuai Ren, Hao Wang, Shibo He, Wenchao Meng

Humans naturally adapt to diverse environments by learning underlying rules across worlds with different dynamics, observations, and reward structures. In contrast, existing agents typically demonstrate improvements via self-evolving within…

The rapid progress of large language models (LLMs) has sparked growing interest in building Artificial General Intelligence (AGI) within Graphical User Interface (GUI) environments. However, existing GUI agents based on LLMs or…

Artificial Intelligence · Computer Science 2025-05-27 Runliang Niu, Jinglong Ji, Yi Chang, Qi Wang

Modern films, games and virtual reality applications are dependent on convincing computer graphics. Highly complex models are a requirement for the successful delivery of many scenes and environments. While workflows such as rendering,…

Neural and Evolutionary Computing · Computer Science 2016-04-21 Jan Kruse, Andy M. Connor

Infants are experts at playing, with an amazing ability to generate novel structured behaviors in unstructured environments that lack clear extrinsic reward signals. We seek to mathematically formalize these abilities using a neural network…

Machine Learning · Computer Science 2018-11-01 Nick Haber, Damian Mrowca, Li Fei-Fei, Daniel L. K. Yamins

We introduce GenAgent, unifying visual understanding and generation through an agentic multimodal model. Unlike unified models that face expensive training costs and understanding-generation trade-offs, GenAgent decouples these capabilities…

Computer Vision and Pattern Recognition · Computer Science 2026-01-29 Kaixun Jiang, Yuzheng Wang, Junjie Zhou, Pandeng Li, Zhihang Liu, Chen-Wei Xie, Zhaoyu Chen, Yun Zheng, Wenqiang Zhang

Understanding, navigating, and exploring the 3D physical real world has long been a central challenge in the development of artificial intelligence. In this work, we take a step toward this goal by introducing GenEx, a system capable of…

Computer Vision and Pattern Recognition · Computer Science 2025-01-22 Taiming Lu, Tianmin Shu, Junfei Xiao, Luoxin Ye, Jiahao Wang, Cheng Peng, Chen Wei, Daniel Khashabi, Rama Chellappa, Alan Yuille, Jieneng Chen

Video generation models, as one form of world models, have emerged as one of the most exciting frontiers in AI, promising agents the ability to imagine the future by modeling the temporal evolution of complex scenes. In autonomous driving,…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Yang Zhou, Hao Shao, Letian Wang, Zhuofan Zong, Hongsheng Li, Steven L. Waslander

With the rapid development of Large Vision Language Models, the focus of Graphical User Interface (GUI) agent tasks shifts from single-screen tasks to complex screen navigation challenges. However, real-world GUI environments, such as PC…

Computer Vision and Pattern Recognition · Computer Science 2025-12-03 Haolong Yan, Yeqing Shen, Xin Huang, Jia Wang, Kaijun Tan, Zhixuan Liang, Hongxin Li, Zheng Ge, Osamu Yoshie, Si Li, Xiangyu Zhang, Daxin Jiang

Planning with partial observation is a central challenge in embodied AI. A majority of prior works have tackled this challenge by developing agents that physically explore their environment to update their beliefs about the world state. In…

Computer Vision and Pattern Recognition · Computer Science 2025-09-09 Taiming Lu, Tianmin Shu, Alan Yuille, Daniel Khashabi, Jieneng Chen

Learning an agent model that behaves like humans, capable of jointly perceiving the environment, predicting the future, and taking actions from a first-person perspective, is a fundamental challenge in computer vision. Existing methods…

Computer Vision and Pattern Recognition · Computer Science 2025-09-12 Lu Chen, Yizhou Wang, Shixiang Tang, Qianhong Ma, Tong He, Wanli Ouyang, Xiaowei Zhou, Hujun Bao, Sida Peng

Large-scale, high-quality interaction trajectories are essential for advancing mobile Graphical User Interface (GUI) agents. While existing methods typically rely on labor-intensive human demonstrations or automated model exploration to…

Artificial Intelligence · Computer Science 2026-02-02 Linjia Kang, Zhimin Wang, Yongkang Zhang, Duo Wu, Jinghe Wang, Ming Ma, Haopeng Yan, Zhi Wang

Understanding human behavior in built environments is critical for designing functional, user-centered urban spaces. Traditional approaches, such as manual observations, surveys, and simplified simulations, often fail to capture the…

Artificial Intelligence · Computer Science 2024-12-30 Ariel Noyman, Kai Hu, Kent Larson

Agents powered by large language models have shown remarkable abilities in solving complex tasks. However, most agent systems remain reactive, limiting their effectiveness in scenarios requiring foresight and autonomous decision-making. In…

Multi-agent reinforcement learning faces fundamental challenges that conventional approaches have failed to overcome: exponentially growing joint action spaces, non-stationary environments where simultaneous learning creates moving targets,…

Artificial Intelligence · Computer Science 2025-07-15 Hang Wang, Junshan Zhang

Standard reinforcement learning (RL) for large language model (LLM) agents typically optimizes extrinsic rewards, prioritizing isolated task completion over continual adaptation. Consequently, agents often converge to suboptimal policies…

Artificial Intelligence · Computer Science 2026-03-31 Xiaoying Zhang, Zichen Liu, Yipeng Zhang, Xia Hu, Wenqi Shao