Related papers: Unsupervised Object-Centric Learning from Multiple…

Unsupervised Learning of Compositional Scene Representations from Multiple Unspecified Viewpoints

Visual scenes are extremely rich in diversity, not only because there are infinite combinations of objects and background, but also because the observations of the same scene may vary greatly with the change of viewpoints. When observing a…

Computer Vision and Pattern Recognition · Computer Science 2021-12-14 Jinyang Yuan, Bin Li, Xiangyang Xue

Provably Learning Object-Centric Representations

Learning structured representations of the visual world in terms of objects promises to significantly improve the generalization abilities of current machine learning models. While recent efforts to this end have shown promising empirical…

Machine Learning · Computer Science 2023-05-24 Jack Brady, Roland S. Zimmermann, Yash Sharma, Bernhard Schölkopf, Julius von Kügelgen, Wieland Brendel

Compositional Scene Modeling with Global Object-Centric Representations

The appearance of the same object may vary in different scene images due to perspectives and occlusions between objects. Humans can easily identify the same object, even if occlusions exist, by completing the occluded parts based on its…

Computer Vision and Pattern Recognition · Computer Science 2022-11-28 Tonglin Chen, Bin Li, Zhimeng Shen, Xiangyang Xue

Improving Viewpoint-Independent Object-Centric Representations through Active Viewpoint Selection

Given the complexities inherent in visual scenes, such as object occlusion, a comprehensive understanding often requires observation from multiple viewpoints. Existing multi-viewpoint object-centric learning methods typically employ random…

Computer Vision and Pattern Recognition · Computer Science 2024-11-04 Yinxuan Huang, Chengmin Gao, Bin Li, Xiangyang Xue

Multi-Object Representation Learning with Iterative Variational Inference

Human perception is structured around objects which form the basis for our higher-level cognition and impressive systematic generalization abilities. Yet most work on representation learning focuses on feature learning without even…

Machine Learning · Computer Science 2020-07-29 Klaus Greff, Raphaël Lopez Kaufman, Rishabh Kabra, Nick Watters, Chris Burgess, Daniel Zoran, Loic Matthey, Matthew Botvinick, Alexander Lerchner

Self-supervised Visual Reinforcement Learning with Object-centric Representations

Autonomous agents need large repertoires of skills to act reasonably on new tasks that they have not seen before. However, acquiring these skills using only a stream of high-dimensional, unstructured, and unlabeled observations is a tricky…

Machine Learning · Computer Science 2021-02-09 Andrii Zadaianchuk, Maximilian Seitzer, Georg Martius

SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition

To help agents reason about scenes in terms of their building blocks, we wish to extract the compositional structure of any given scene (in particular, the configuration and characteristics of objects comprising the scene). This problem is…

Computer Vision and Pattern Recognition · Computer Science 2021-12-07 Rishabh Kabra, Daniel Zoran, Goker Erdogan, Loic Matthey, Antonia Creswell, Matthew Botvinick, Alexander Lerchner, Christopher P. Burgess

Unsupervised Part-Based Disentangling of Object Shape and Appearance

Large intra-class variation is the result of changes in multiple object characteristics. Images, however, only show the superposition of different variable factors such as appearance or shape. Therefore, learning to disentangle and…

Computer Vision and Pattern Recognition · Computer Science 2019-06-18 Dominik Lorenz, Leonard Bereska, Timo Milbich, Björn Ommer

Unsupervised object-centric video generation and decomposition in 3D

A natural approach to generative modeling of videos is to represent them as a composition of moving objects. Recent works model a set of 2D sprites over a slowly-varying background, but without considering the underlying 3D scene that gives…

Computer Vision and Pattern Recognition · Computer Science 2021-03-26 Paul Henderson, Christoph H. Lampert

Unsupervised Learning of Object Structure and Dynamics from Videos

Extracting and predicting object structure and dynamics from videos without supervision is a major challenge in machine learning. To address this challenge, we adopt a keypoint-based image representation and learn a stochastic dynamics…

Computer Vision and Pattern Recognition · Computer Science 2020-03-03 Matthias Minderer, Chen Sun, Ruben Villegas, Forrester Cole, Kevin Murphy, Honglak Lee

Time-Conditioned Generative Modeling of Object-Centric Representations for Video Decomposition and Prediction

When perceiving the world from multiple viewpoints, humans have the ability to reason about the complete objects in a compositional manner even when an object is completely occluded from certain viewpoints. Meanwhile, humans are able to…

Computer Vision and Pattern Recognition · Computer Science 2023-10-27 Chengmin Gao, Bin Li

SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition

The ability to decompose complex multi-object scenes into meaningful abstractions like objects is fundamental to achieve higher-level cognition. Previous approaches for unsupervised object-oriented scene representation learning are either…

Machine Learning · Computer Science 2020-03-17 Zhixuan Lin, Yi-Fu Wu, Skand Vishwanath Peri, Weihao Sun, Gautam Singh, Fei Deng, Jindong Jiang, Sungjin Ahn

Learning Features by Watching Objects Move

This paper presents a novel yet intuitive approach to unsupervised feature learning. Inspired by the human visual system, we explore whether low-level motion-based grouping cues can be used to learn an effective visual representation.…

Computer Vision and Pattern Recognition · Computer Science 2017-04-13 Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan

Variational Inference for Scalable 3D Object-centric Learning

We tackle the task of scalable unsupervised object-centric representation learning on 3D scenes. Existing approaches to object-centric representation learning show limitations in generalizing to larger scenes as their learning processes…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Tianyu Wang, Kee Siong Ng, Miaomiao Liu

Self-Supervision by Prediction for Object Discovery in Videos

Despite their irresistible success, deep learning algorithms still heavily rely on annotated data. On the other hand, unsupervised settings pose many challenges, especially about determining the right inductive bias in diverse scenarios.…

Computer Vision and Pattern Recognition · Computer Science 2021-03-11 Beril Besbinar, Pascal Frossard

Unsupervised learning from video to detect foreground objects in single images

Unsupervised learning from visual data is one of the most difficult challenges in computer vision, being a fundamental task for understanding how visual recognition works. From a practical point of view, learning from unsupervised visual…

Computer Vision and Pattern Recognition · Computer Science 2017-04-03 Ioana Croitoru, Simion-Vlad Bogolin, Marius Leordeanu

ViewNet: Unsupervised Viewpoint Estimation from Conditional Generation

Understanding the 3D world without supervision is currently a major challenge in computer vision as the annotations required to supervise deep networks for tasks in this domain are expensive to obtain on a large scale. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2022-12-02 Octave Mariotti, Oisin Mac Aodha, Hakan Bilen

Exploiting Spatial Invariance for Scalable Unsupervised Object Tracking

The ability to detect and track objects in the visual world is a crucial skill for any intelligent agent, as it is a necessary precursor to any object-level reasoning process. Moreover, it is important that agents learn to track objects…

Machine Learning · Computer Science 2019-11-21 Eric Crawford, Joelle Pineau

Learning Global Object-Centric Representations via Disentangled Slot Attention

Humans can discern scene-independent features of objects across various environments, allowing them to swiftly identify objects amidst changing factors such as lighting, perspective, size, and position and imagine the complete images of the…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Tonglin Chen, Yinxuan Huang, Zhimeng Shen, Jinghao Huang, Bin Li, Xiangyang Xue

Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations

Perceptual understanding of the scene and the relationship between its different components is important for successful completion of robotic tasks. Representation learning has been shown to be a powerful technique for this, but most of the…

Robotics · Computer Science 2023-03-14 Negin Heravi, Ayzaan Wahid, Corey Lynch, Pete Florence, Travis Armstrong, Jonathan Tompson, Pierre Sermanet, Jeannette Bohg, Debidatta Dwibedi