Related papers: Self-supervised Visual Reinforcement Learning with…

Self-supervised Reinforcement Learning with Independently Controllable Subgoals

To successfully tackle challenging manipulation tasks, autonomous agents must learn a diverse set of skills and how to combine them. Recently, self-supervised agents that set their own abstract goals by exploiting the discovered structure…

Machine Learning · Computer Science 2022-02-01 Andrii Zadaianchuk , Georg Martius , Fanny Yang

Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations

Perceptual understanding of the scene and the relationship between its different components is important for successful completion of robotic tasks. Representation learning has been shown to be a powerful technique for this, but most of the…

Robotics · Computer Science 2023-03-14 Negin Heravi , Ayzaan Wahid , Corey Lynch , Pete Florence , Travis Armstrong , Jonathan Tompson , Pierre Sermanet , Jeannette Bohg , Debidatta Dwibedi

Learning to Compose: Improving Object Centric Learning by Injecting Compositionality

Learning compositional representation is a key aspect of object-centric learning as it enables flexible systematic generalization and supports complex visual reasoning. However, most of the existing approaches rely on auto-encoding…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Whie Jung , Jaehoon Yoo , Sungjin Ahn , Seunghoon Hong

Provably Learning Object-Centric Representations

Learning structured representations of the visual world in terms of objects promises to significantly improve the generalization abilities of current machine learning models. While recent efforts to this end have shown promising empirical…

Machine Learning · Computer Science 2023-05-24 Jack Brady , Roland S. Zimmermann , Yash Sharma , Bernhard Schölkopf , Julius von Kügelgen , Wieland Brendel

Unsupervised Object-Centric Learning from Multiple Unspecified Viewpoints

Visual scenes are extremely diverse, not only because there are infinite possible combinations of objects and backgrounds but also because the observations of the same scene may vary greatly with the change of viewpoints. When observing a…

Computer Vision and Pattern Recognition · Computer Science 2024-01-05 Jinyang Yuan , Tonglin Chen , Zhimeng Shen , Bin Li , Xiangyang Xue

Relational Object-Centric Actor-Critic

The advances in unsupervised object-centric representation learning have significantly improved its application to downstream tasks. Recent works highlight that disentangled object representations can aid policy learning in image-based,…

Artificial Intelligence · Computer Science 2025-03-21 Leonid Ugadiarov , Vitaliy Vorobyov , Aleksandr I. Panov

Unsupervised Learning of Compositional Scene Representations from Multiple Unspecified Viewpoints

Visual scenes are extremely rich in diversity, not only because there are infinite combinations of objects and background, but also because the observations of the same scene may vary greatly with the change of viewpoints. When observing a…

Computer Vision and Pattern Recognition · Computer Science 2021-12-14 Jinyang Yuan , Bin Li , Xiangyang Xue

Learning Actionable Representations from Visual Observations

In this work we explore a new approach for robots to teach themselves about the world simply by observing it. In particular we investigate the effectiveness of learning task-agnostic representations for continuous control tasks. We extend…

Computer Vision and Pattern Recognition · Computer Science 2019-02-05 Debidatta Dwibedi , Jonathan Tompson , Corey Lynch , Pierre Sermanet

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

Well structured visual representations can make robot learning faster and can improve generalization. In this paper, we study how we can acquire effective object-centric representations for robotic manipulation tasks without human labeling…

Robotics · Computer Science 2018-11-20 Eric Jang , Coline Devin , Vincent Vanhoucke , Sergey Levine

Visual Reinforcement Learning with Imagined Goals

For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires. Furthermore, to provide the requisite level of generality, these skills…

Machine Learning · Computer Science 2018-12-05 Ashvin Nair , Vitchyr Pong , Murtaza Dalal , Shikhar Bahl , Steven Lin , Sergey Levine

Robust and Controllable Object-Centric Learning through Energy-based Models

Humans are remarkably good at understanding and reasoning about complex visual scenes. The capability to decompose low-level observations into discrete objects allows us to build a grounded abstract representation and identify the…

Machine Learning · Computer Science 2022-10-12 Ruixiang Zhang , Tong Che , Boris Ivanovic , Renhao Wang , Marco Pavone , Yoshua Bengio , Liam Paull

Object-Centric Action-Enhanced Representations for Robot Visuo-Motor Policy Learning

Learning visual representations from observing actions to benefit robot visuo-motor policy generation is a promising direction that closely resembles human cognitive function and perception. Motivated by this, and further inspired by…

Robotics · Computer Science 2025-05-28 Nikos Giannakakis , Argyris Manetas , Panagiotis P. Filntisis , Petros Maragos , George Retsinas

Learning Global Object-Centric Representations via Disentangled Slot Attention

Humans can discern scene-independent features of objects across various environments, allowing them to swiftly identify objects amidst changing factors such as lighting, perspective, size, and position and imagine the complete images of the…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Tonglin Chen , Yinxuan Huang , Zhimeng Shen , Jinghao Huang , Bin Li , Xiangyang Xue

Deep Object-Centric Representations for Generalizable Robot Learning

Robotic manipulation in complex open-world scenarios requires both reliable physical manipulation skills and effective and generalizable perception. In this paper, we propose a method where general purpose pretrained visual models serve as…

Robotics · Computer Science 2017-09-27 Coline Devin , Pieter Abbeel , Trevor Darrell , Sergey Levine

Zero-Shot Object-Centric Representation Learning

The goal of object-centric representation learning is to decompose visual scenes into a structured representation that isolates the entities. Recent successes have shown that object-centric representation learning can be scaled to…

Computer Vision and Pattern Recognition · Computer Science 2024-08-20 Aniket Didolkar , Andrii Zadaianchuk , Anirudh Goyal , Mike Mozer , Yoshua Bengio , Georg Martius , Maximilian Seitzer

Successes and Limitations of Object-centric Models at Compositional Generalisation

In recent years, it has been shown empirically that standard disentangled latent variable models do not support robust compositional learning in the visual domain. Indeed, in spite of being designed with the goal of factorising datasets…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Milton L. Montero , Jeffrey S. Bowers , Gaurav Malhotra

Agent-centric learning: from external reward maximization to internal knowledge curation

The pursuit of general intelligence has traditionally centered on external objectives: an agent's control over its environments or mastery of specific tasks. This external focus, however, can produce specialized agents that lack…

Machine Learning · Computer Science 2025-07-31 Hanqi Zhou , Fryderyk Mantiuk , David G. Nagy , Charley M. Wu

Towards Self-Supervised Learning of Global and Object-Centric Representations

Self-supervision allows learning meaningful representations of natural images, which usually contain one central object. How well does it transfer to multi-entity scenes? We discuss key aspects of learning structured object-centric…

Computer Vision and Pattern Recognition · Computer Science 2022-04-15 Federico Baldassarre , Hossein Azizpour

Object-centric proto-symbolic behavioural reasoning from pixels

Autonomous intelligent agents must bridge computational challenges at disparate levels of abstraction, from the low-level spaces of sensory input and motor commands to the high-level domain of abstract reasoning and planning. A key question…

Artificial Intelligence · Computer Science 2025-12-12 Ruben van Bergen , Justus Hübotter , Alma Lago , Pablo Lanillos

Object-centric architectures enable efficient causal representation learning

Causal representation learning has showed a variety of settings in which we can disentangle latent variables with identifiability guarantees (up to some reasonable equivalence class). Common to all of these approaches is the assumption that…

Machine Learning · Computer Science 2023-10-31 Amin Mansouri , Jason Hartford , Yan Zhang , Yoshua Bengio