Related papers: 3D-OES: Viewpoint-Invariant Object-Factorized Envi…

200 papers

Humans have a strong intuitive understanding of the 3D environment around us. The mental model of the physics in our brain applies to objects of different materials and enables us to perform a wide range of manipulation tasks that are far…

Robotics · Computer Science 2021-11-15 Yunzhu Li, Shuang Li, Vincent Sitzmann, Pulkit Agrawal, Antonio Torralba

Given a visual scene, humans have strong intuitions about how a scene can evolve over time under given actions. The intuition, often termed visual intuitive physics, is a critical ability that allows us to make effective plans to manipulate…

Computer Vision and Pattern Recognition · Computer Science 2023-04-25 Haotian Xue, Antonio Torralba, Joshua B. Tenenbaum, Daniel LK Yamins, Yunzhu Li, Hsiao-Yu Tung

A core challenge for an agent learning to interact with the world is to predict how its actions affect objects in its environment. Many existing methods for learning the dynamics of physical interactions require labeled object information.…

Machine Learning · Computer Science 2016-10-19 Chelsea Finn, Ian Goodfellow, Sergey Levine

Machines that can predict the effect of physical interactions on the dynamics of previously unseen object instances are important for creating better robots and interactive virtual worlds. In this work, we focus on predicting the dynamics…

Computer Vision and Pattern Recognition · Computer Science 2020-01-20 Davis Rempe, Srinath Sridhar, He Wang, Leonidas J. Guibas

The ability to simulate the effects of future actions on the world is a crucial ability of intelligent embodied agents, enabling agents to anticipate the effects of their actions and make plans accordingly. While a large body of existing…

Computer Vision and Pattern Recognition · Computer Science 2025-05-12 Siyuan Zhou, Yilun Du, Yuncong Yang, Lei Han, Peihao Chen, Dit-Yan Yeung, Chuang Gan

We propose a system that learns to detect objects and infer their 3D poses in RGB-D images. Many existing systems can identify objects and infer 3D poses, but they heavily rely on human labels and 3D annotations. The challenge here is to…

Computer Vision and Pattern Recognition · Computer Science 2020-11-02 Mihir Prabhudesai, Shamit Lal, Hsiao-Yu Fish Tung, Adam W. Harley, Shubhankar Potdar, Katerina Fragkiadaki

Videos of robots interacting with objects encode rich information about the objects' dynamics. However, existing video prediction approaches typically do not explicitly account for the 3D information from videos, such as robot actions and…

Robotics · Computer Science 2024-10-25 Mingtong Zhang, Kaifeng Zhang, Yunzhu Li

Human perception involves decomposing complex multi-object scenes into time-static object appearance (i.e., size, shape, color) and time-varying object motion (i.e., position, velocity, acceleration). For machines to achieve human-like…

Computer Vision and Pattern Recognition · Computer Science 2025-07-22 Yeon-Ji Song, Jaein Kim, Suhyung Choi, Jin-Hwa Kim, Byoung-Tak Zhang

Humans can effortlessly anticipate how objects might move or change through interaction--imagining a cup being lifted, a knife slicing, or a lid being closed. We aim to endow computational systems with a similar ability to predict plausible…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Rustin Soraki, Homanga Bharadhwaj, Ali Farhadi, Roozbeh Mottaghi

The ability to model the underlying dynamics of visual scenes and reason about the future is central to human intelligence. Many attempts have been made to empower intelligent systems with such physical understanding and prediction…

Computer Vision and Pattern Recognition · Computer Science 2024-03-18 Huilin Xu, Tao Chen, Feng Xu

Predicting the future interaction of objects when they come into contact with their environment is key for autonomous agents to take intelligent and anticipatory actions. This paper presents a perception framework that fuses visual and…

Machine Learning · Computer Science 2021-01-21 Sahand Rezaei-Shoshtari, Francois Robert Hogan, Michael Jenkin, David Meger, Gregory Dudek

We tackle the task of scalable unsupervised object-centric representation learning on 3D scenes. Existing approaches to object-centric representation learning show limitations in generalizing to larger scenes as their learning processes…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Tianyu Wang, Kee Siong Ng, Miaomiao Liu

Learned visual dynamics models have proven effective for robotic manipulation tasks. Yet, it remains unclear how best to represent scenes involving multi-object interactions. Current methods decompose a scene into discrete objects, but they…

We propose a method for incorporating object interaction and human body dynamics into the task of 3D ego-pose estimation using a head-mounted camera. We use a kinematics model of the human body to represent the entire range of human motion,…

Computer Vision and Pattern Recognition · Computer Science 2020-12-10 Zhengyi Luo, Ryo Hachiuma, Ye Yuan, Shun Iwase, Kris M. Kitani

Forecasting a typical object's future motion is a critical task for interpreting and interacting with dynamic environments in computer vision. Event-based sensors, which could capture changes in the scene with exceptional temporal…

Computer Vision and Pattern Recognition · Computer Science 2024-10-14 Song Wu, Zhiyu Zhu, Junhui Hou, Guangming Shi, Jinjian Wu

We present a method to map 2D image observations of a scene to a persistent 3D scene representation, enabling novel view synthesis and disentangled representation of the movable and immovable components of the scene. Motivated by the…

Computer Vision and Pattern Recognition · Computer Science 2023-04-11 Prafull Sharma, Ayush Tewari, Yilun Du, Sergey Zakharov, Rares Ambrus, Adrien Gaidon, William T. Freeman, Fredo Durand, Joshua B. Tenenbaum, Vincent Sitzmann

We hypothesize that an agent that can look around in static scenes can learn rich visual representations applicable to 3D object tracking in complex dynamic scenes. We are motivated in this pursuit by the fact that the physical world itself…

Computer Vision and Pattern Recognition · Computer Science 2020-08-05 Adam W. Harley, Shrinidhi K. Lakshmikanth, Paul Schydlo, Katerina Fragkiadaki

Object-centric representations are a promising path toward more systematic generalization by providing flexible abstractions upon which compositional world models can be built. Recent work on simple 2D and 3D datasets has shown that models…

Computer Vision and Pattern Recognition · Computer Science 2022-03-16 Thomas Kipf, Gamaleldin F. Elsayed, Aravindh Mahendran, Austin Stone, Sara Sabour, Georg Heigold, Rico Jonschkowski, Alexey Dosovitskiy, Klaus Greff

Humans navigate in their environment by learning a mental model of the world through passive observation and active interaction. Their world model allows them to anticipate what might happen next and act accordingly with respect to an…

Computer Vision and Pattern Recognition · Computer Science 2023-06-16 Anthony Hu

Neural networks have recently been used to analyze diverse physical systems and to identify the underlying dynamics. While existing methods achieve impressive results, they are limited by their strong demand for training data and their weak…

Computer Vision and Pattern Recognition · Computer Science 2024-04-03 Florian Hofherr, Lukas Koestler, Florian Bernard, Daniel Cremers