Related papers: Deep Object-Centric Representations for Generaliza…

Versatile and Generalizable Manipulation via Goal-Conditioned Reinforcement Learning with Grounded Object Detection

General-purpose robotic manipulation, including reach and grasp, is essential for deployment into households and workspaces involving diverse and evolving tasks. Recent advances propose using large pre-trained models, such as Large Language…

Robotics · Computer Science 2025-07-16 Huiyi Wang, Fahim Shahriar, Alireza Azimi, Gautham Vasan, Rupam Mahmood, Colin Bellinger

Is an object-centric representation beneficial for robotic manipulation?

Object-centric representation (OCR) has recently become a subject of interest in the computer vision community for learning a structured representation of images and videos. It has been several times presented as a potential way to improve…

Artificial Intelligence · Computer Science 2025-06-25 Alexandre Chapin, Emmanuel Dellandrea, Liming Chen

Disentangled Object-Centric Image Representation for Robotic Manipulation

Learning robotic manipulation skills from vision is a promising approach for developing robotics applications that can generalize broadly to real-world scenarios. As such, many approaches to enable this vision have been explored with…

Computer Vision and Pattern Recognition · Computer Science 2025-03-17 David Emukpere, Romain Deffayet, Bingbing Wu, Romain Brégier, Michael Niemaz, Jean-Luc Meunier, Denys Proux, Jean-Michel Renders, Seungsu Kim

Object-Centric Representations Improve Policy Generalization in Robot Manipulation

Visual representations are central to the learning and generalization capabilities of robotic manipulation policies. While existing methods rely on global or dense features, such representations often entangle task-relevant and irrelevant…

Robotics · Computer Science 2025-05-20 Alexandre Chapin, Bruno Machado, Emmanuel Dellandrea, Liming Chen

Vision-based Robot Manipulation Learning via Human Demonstrations

Vision-based learning methods provide promise for robots to learn complex manipulation tasks. However, how to generalize the learned manipulation skills to real-world interactions remains an open question. In this work, we study robotic…

Robotics · Computer Science 2020-03-03 Zhixin Jia, Mengxiang Lin, Zhixin Chen, Shibo Jian

Discovering Object-Centric Generalized Value Functions From Pixels

Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs albeit using hand-crafted auxiliary tasks and pseudo rewards. Automatically learning such representations in an…

Machine Learning · Computer Science 2023-06-28 Somjit Nath, Gopeshh Raaj Subbaraj, Khimya Khetarpal, Samira Ebrahimi Kahou

Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains. We…

Robotics · Computer Science 2018-12-04 Frederik Ebert, Chelsea Finn, Sudeep Dasari, Annie Xie, Alex Lee, Sergey Levine

Object-centric Forward Modeling for Model Predictive Control

We present an approach to learn an object-centric forward model, and show that this allows us to plan for sequences of actions to achieve distant desired goals. We propose to model a scene as a collection of objects, each with an explicit…

Computer Vision and Pattern Recognition · Computer Science 2019-10-09 Yufei Ye, Dhiraj Gandhi, Abhinav Gupta, Shubham Tulsiani

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

Well structured visual representations can make robot learning faster and can improve generalization. In this paper, we study how we can acquire effective object-centric representations for robotic manipulation tasks without human labeling…

Robotics · Computer Science 2018-11-20 Eric Jang, Coline Devin, Vincent Vanhoucke, Sergey Levine

Zero-Shot Object-Centric Representation Learning

The goal of object-centric representation learning is to decompose visual scenes into a structured representation that isolates the entities. Recent successes have shown that object-centric representation learning can be scaled to…

Computer Vision and Pattern Recognition · Computer Science 2024-08-20 Aniket Didolkar, Andrii Zadaianchuk, Anirudh Goyal, Mike Mozer, Yoshua Bengio, Georg Martius, Maximilian Seitzer

Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation

What is the right object representation for manipulation? We would like robots to visually perceive scenes and learn an understanding of the objects in them that (i) is task-agnostic and can be used as a building block for a variety of…

Robotics · Computer Science 2018-09-10 Peter R. Florence, Lucas Manuelli, Russ Tedrake

Generalizing Object-Centric Task-Axes Controllers using Keypoints

To perform manipulation tasks in the real world, robots need to operate on objects with various shapes, sizes and without access to geometric models. It is often unfeasible to train monolithic neural network policies across such large…

Robotics · Computer Science 2021-03-22 Mohit Sharma, Oliver Kroemer

FOCUS: Object-Centric World Models for Robotics Manipulation

Understanding the world in terms of objects and the possible interplays with them is an important cognition ability, especially in robotics manipulation, where many tasks require robot-object interactions. However, learning such a…

Robotics · Computer Science 2023-07-10 Stefano Ferraro, Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt

Guided Visual Attention Model Based on Interactions Between Top-down and Bottom-up Information for Robot Pose Prediction

Deep robot vision models are widely used for recognizing objects from camera images, but show poor performance when detecting objects at untrained positions. Although such a problem can be alleviated by training with large datasets, the…

Robotics · Computer Science 2022-10-26 Hyogo Hiruma, Hiroki Mori, Hiroshi Ito, Tetsuya Ogata

Learning Manipulation Skills Via Hierarchical Spatial Attention

Learning generalizable skills in robotic manipulation has long been challenging due to real-world sized observation and action spaces. One method for addressing this problem is attention focus -- the robot learns where to attend its sensors…

Robotics · Computer Science 2020-03-05 Marcus Gualtieri, Robert Platt

Simultaneous Multi-View Object Recognition and Grasping in Open-Ended Domains

To aid humans in everyday tasks, robots need to know which objects exist in the scene, where they are, and how to grasp and manipulate them in different situations. Therefore, object recognition and grasping are two key functionalities for…

Robotics · Computer Science 2022-12-07 Hamidreza Kasaei, Sha Luo, Remo Sasso, Mohammadreza Kasaei

Object-Centric Action-Enhanced Representations for Robot Visuo-Motor Policy Learning

Learning visual representations from observing actions to benefit robot visuo-motor policy generation is a promising direction that closely resembles human cognitive function and perception. Motivated by this, and further inspired by…

Robotics · Computer Science 2025-05-28 Nikos Giannakakis, Argyris Manetas, Panagiotis P. Filntisis, Petros Maragos, George Retsinas

Generalizable Task Planning through Representation Pretraining

The ability to plan for multi-step manipulation tasks in unseen situations is crucial for future home robots. But collecting sufficient experience data for end-to-end learning is often infeasible in the real world, as deploying robots in…

Robotics · Computer Science 2022-05-18 Chen Wang, Danfei Xu, Li Fei-Fei

Learning Global Object-Centric Representations via Disentangled Slot Attention

Humans can discern scene-independent features of objects across various environments, allowing them to swiftly identify objects amidst changing factors such as lighting, perspective, size, and position and imagine the complete images of the…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Tonglin Chen, Yinxuan Huang, Zhimeng Shen, Jinghao Huang, Bin Li, Xiangyang Xue

FLEX: A Framework for Learning Robot-Agnostic Force-based Skills Involving Sustained Contact Object Manipulation

Learning to manipulate objects efficiently, particularly those involving sustained contact (e.g., pushing, sliding) and articulated parts (e.g., drawers, doors), presents significant challenges. Traditional methods, such as robot-centric…

Robotics · Computer Science 2025-03-18 Shijie Fang, Wenchang Gao, Shivam Goel, Christopher Thierauf, Matthias Scheutz, Jivko Sinapov