Related papers: Object-Oriented Dynamics Learning through Multi-Le…

Object-Oriented Dynamics Predictor

Generalization has been one of the major challenges for learning dynamics models in model-based reinforcement learning. However, previous work on action-conditioned dynamics prediction focuses on learning the pixel-level motion and thus…

Computer Vision and Pattern Recognition · Computer Science · 2018-10-31 · Guangxiang Zhu, Zhiao Huang, Chongjie Zhang

Multi-Objective Meta Learning

Meta learning with multiple objectives can be formulated as a Multi-Objective Bi-Level optimization Problem (MOBLP) where the upper-level subproblem is to solve several possibly conflicting targets for the meta learner. However, existing…

Machine Learning · Computer Science · 2021-02-16 · Feiyang Ye, Baijiong Lin, Zhixiong Yue, Pengxin Guo, Qiao Xiao, Yu Zhang

Learning Physical Dynamics for Object-centric Visual Prediction

The ability to model the underlying dynamics of visual scenes and reason about the future is central to human intelligence. Many attempts have been made to empower intelligent systems with such physical understanding and prediction…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-18 · Huilin Xu, Tao Chen, Feng Xu

Multiple Object Tracking as ID Prediction

Multi-Object Tracking (MOT) has been a long-standing challenge in video understanding. A natural and intuitive approach is to split this task into two parts: object detection and association. Most mainstream methods employ meticulously…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-25 · Ruopeng Gao, Ji Qi, Limin Wang

Learning Multi-Object Dynamics with Compositional Neural Radiance Fields

We present a method to learn compositional multi-object dynamics models from image observations based on implicit object encoders, Neural Radiance Fields (NeRFs), and graph neural networks. NeRFs have become a popular choice for…

Computer Vision and Pattern Recognition · Computer Science · 2022-07-28 · Danny Driess, Zhiao Huang, Yunzhu Li, Russ Tedrake, Marc Toussaint

Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views

Learning object-centric representations of multi-object scenes is a promising approach towards machine intelligence, facilitating high-level reasoning and control from visual sensory data. However, current approaches for unsupervised…

Computer Vision and Pattern Recognition · Computer Science · 2021-11-16 · Li Nanbo, Cian Eastwood, Robert B. Fisher

Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations

Perceptual understanding of the scene and the relationship between its different components is important for successful completion of robotic tasks. Representation learning has been shown to be a powerful technique for this, but most of the…

Robotics · Computer Science · 2023-03-14 · Negin Heravi, Ayzaan Wahid, Corey Lynch, Pete Florence, Travis Armstrong, Jonathan Tompson, Pierre Sermanet, Jeannette Bohg, Debidatta Dwibedi

LAOF: Robust Latent Action Learning with Optical Flow Constraints

Learning latent actions from large-scale videos is crucial for the pre-training of scalable embodied foundation models, yet existing methods often struggle with action-irrelevant distractors. Although incorporating action supervision can…

Robotics · Computer Science · 2026-03-24 · Xizhou Bu, Jiexi Lyu, Fulei Sun, Ruichen Yang, Zhiqiang Ma, Wei Li

Matching Multiple Perspectives for Efficient Representation Learning

Representation learning approaches typically rely on images of objects captured from a single perspective that are transformed using affine transformations. Additionally, self-supervised learning, a successful paradigm of representation…

Computer Vision and Pattern Recognition · Computer Science · 2022-08-17 · Omiros Pantazis, Mathew Salvaris

Object-Centric Representation Learning with Generative Spatial-Temporal Factorization

Learning object-centric scene representations is essential for attaining structural understanding and abstraction of complex scenes. Yet, as current approaches for unsupervised object-centric representation learning are built upon either a…

Machine Learning · Computer Science · 2021-11-11 · Li Nanbo, Muhammad Ahmed Raza, Hu Wenbin, Zhaole Sun, Robert B. Fisher

Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving

Learning-based perception and prediction modules in modern autonomous driving systems typically rely on expensive human annotation and are designed to perceive only a handful of predefined object categories. This closed-set paradigm is…

Computer Vision and Pattern Recognition · Computer Science · 2022-10-18 · Mahyar Najibi, Jingwei Ji, Yin Zhou, Charles R. Qi, Xinchen Yan, Scott Ettinger, Dragomir Anguelov

Multiple Object Detection, Tracking and Long-Term Dynamics Learning in Large 3D Maps

In this work, we present a method for tracking and learning the dynamics of all objects in a large-scale robot environment. A mobile robot patrols the environment and visits the different locations one by one. Movable objects are discovered…

Robotics · Computer Science · 2018-01-30 · Nils Bore, Patric Jensfelt, John Folkesson

Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics

Object recognition and motion understanding are key components of perception that complement each other. While self-supervised learning methods have shown promise in their ability to learn from unlabeled data, they have primarily focused on…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-08 · Christopher Hoang, Mengye Ren

Predictable MDP Abstraction for Unsupervised Model-Based RL

A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions. Errors in this predictive model can degrade the performance of model-based controllers, and complex Markov decision…

Machine Learning · Computer Science · 2023-06-06 · Seohong Park, Sergey Levine

Learning 3D object-centric representation through prediction

As part of human core knowledge, the representation of objects is the building block of mental representation that supports high-level concepts and symbolic reasoning. While humans develop the ability of perceiving objects situated in 3D…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-07 · John Day, Tushar Arora, Jirui Liu, Li Erran Li, Ming Bo Cai

Learning Language-Conditioned Deformable Object Manipulation with Graph Dynamics

Multi-task learning of deformable object manipulation is a challenging problem in robot manipulation. Most previous works address this problem in a goal-conditioned way and adapt goal images to specify different tasks, which limits the…

Robotics · Computer Science · 2024-01-30 · Yuhong Deng, Kai Mo, Chongkun Xia, Xueqian Wang

Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction

In the face of difficult exploration problems in reinforcement learning, we study whether giving an agent an object-centric mapping (describing a set of items and their attributes) allows for more efficient learning. We found this problem is…

Machine Learning · Computer Science · 2025-04-15 · Anthony GX-Chen, Kenneth Marino, Rob Fergus

UTOPIA: Unconstrained Tracking Objects without Preliminary Examination via Cross-Domain Adaptation

Multiple Object Tracking (MOT) aims to find bounding boxes and identities of targeted objects in consecutive video frames. While fully-supervised MOT methods have achieved high accuracy on existing datasets, they cannot generalize well on a…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-19 · Pha Nguyen, Kha Gia Quach, John Gauch, Samee U. Khan, Bhiksha Raj, Khoa Luu

Enhancing Embodied Object Detection through Language-Image Pre-training and Implicit Object Memory

Deep learning and large-scale language-image training have produced image object detectors that generalise well to diverse environments and semantic classes. However, single-image object detectors trained on internet data are not optimally…

Robotics · Computer Science · 2024-02-07 · Nicolas Harvey Chapman, Feras Dayoub, Will Browne, Chris Lehnert

Dynamics-Aware Spatiotemporal Occupancy Prediction in Urban Environments

Detection and segmentation of moving obstacles, along with prediction of the future occupancy states of the local environment, are essential for autonomous vehicles to proactively make safe and informed decisions. In this paper, we propose…

Robotics · Computer Science · 2022-09-28 · Maneekwan Toyungyernsub, Esen Yel, Jiachen Li, Mykel J. Kochenderfer