Related papers: Semantically Controllable Augmentations for Genera…

200 papers

Robot learning methods have the potential for widespread generalization across tasks, environments, and objects. However, these methods require large diverse datasets that are expensive to collect in real-world robotics settings. For robot…

Robotics · Computer Science · 2023-02-24 · Zoey Chen, Sho Kiami, Abhishek Gupta, Vikash Kumar

We are motivated by the problem of learning policies for robotic systems with rich sensory inputs (e.g., vision) in a manner that allows us to guarantee generalization to environments unseen during training. We provide a framework for…

Robotics · Computer Science · 2022-07-25 · Abhinav Agarwal, Sushant Veer, Allen Z. Ren, Anirudha Majumdar

The ability to plan for multi-step manipulation tasks in unseen situations is crucial for future home robots. But collecting sufficient experience data for end-to-end learning is often infeasible in the real world, as deploying robots in…

Robotics · Computer Science · 2022-05-18 · Chen Wang, Danfei Xu, Li Fei-Fei

Learning robust robot policies in real-world environments requires diverse data augmentation, yet scaling real-world data collection is costly due to the need for acquiring physical assets and reconfiguring environments. Therefore,…

Generalist robot manipulators need to learn a wide variety of manipulation skills across diverse environments. Current robot training pipelines rely on humans to provide kinesthetic demonstrations or to program simulation environments and…

Robotics · Computer Science · 2023-10-30 · Pushkal Katara, Zhou Xian, Katerina Fragkiadaki

Training generalist policies for robotic manipulation has shown great promise, as they enable language-conditioned, multi-task behaviors across diverse scenarios. However, evaluating these policies remains difficult because real-world…

Robotics · Computer Science · 2025-12-05 · Wei-Cheng Tseng, Jinwei Gu, Qinsheng Zhang, Hanzi Mao, Ming-Yu Liu, Florian Shkurti, Lin Yen-Chen

When designing robots to assist in everyday human activities, it is crucial to enhance user requests with visual cues from their surroundings for improved intent understanding. This process is defined as a multimodal classification task.…

Computation and Language · Computer Science · 2025-06-18 · Shang-Chi Tsai, Seiya Kawano, Angel Garcia Contreras, Koichiro Yoshino, Yun-Nung Chen

The rise of generalist robotic policies has created an exponential demand for large-scale training data. However, on-robot data collection is labor-intensive and often limited to specific environments. In contrast, open-world images capture…

Generalist robot policies can now perform a wide range of manipulation skills, but evaluating and improving their ability with unfamiliar objects and instructions remains a significant challenge. Rigorous evaluation requires a large number…

Robotics · Computer Science · 2026-03-03 · Yanjiang Guo, Lucy Xiaoyang Shi, Jianyu Chen, Chelsea Finn

Supervised deep learning methods for segmentation require large amounts of labelled training data, without which they are prone to overfitting, not generalizing well to unseen images. In practice, obtaining a large number of annotations…

Computer Vision and Pattern Recognition · Computer Science · 2019-03-01 · Krishna Chaitanya, Neerav Karani, Christian Baumgartner, Olivio Donati, Anton Becker, Ender Konukoglu

A key challenge in manipulation is learning a policy that can robustly generalize to diverse visual environments. A promising mechanism for learning robust policies is to leverage video generative models, which are pretrained on large-scale…

Data augmentation is crucial for pixel-wise annotation tasks like semantic segmentation, where labeling requires significant effort and intensive labor. Traditional methods, involving simple transformations such as rotations and flips,…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-05 · Quang-Huy Che, Duc-Tri Le, Bich-Nga Pham, Duc-Khai Lam, Vinh-Tiep Nguyen

Generative pre-trained models have demonstrated remarkable effectiveness in language and vision domains by learning useful representations. In this paper, we extend the scope of this effectiveness by showing that visual robot manipulation…

Robotics · Computer Science · 2023-12-22 · Hongtao Wu, Ya Jing, Chilam Cheang, Guangzeng Chen, Jiafeng Xu, Xinghang Li, Minghuan Liu, Hang Li, Tao Kong

Collecting large amounts of real-world interaction data to train general robotic policies is often prohibitively expensive, thus motivating the use of simulation data. However, existing methods for data generation have generally focused on…

Machine Learning · Computer Science · 2024-01-23 · Lirui Wang, Yiyang Ling, Zhecheng Yuan, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang

Imitation learning is a popular paradigm to teach robots new tasks, but collecting robot demonstrations through teleoperation or kinesthetic teaching is tedious and time-consuming. In contrast, directly demonstrating a task using our human…

Robotics · Computer Science · 2026-02-16 · Nick Heppert, Minh Quang Nguyen, Abhinav Valada

The success of deep learning depends heavily on the availability of large datasets, but in robotic manipulation there are many learning problems for which such datasets do not exist. Collecting these datasets is time-consuming and…

Robotics · Computer Science · 2022-07-21 · Peter Mitrano, Dmitry Berenson

The strong performance of large vision-language models (VLMs) trained with reinforcement learning (RL) has motivated similar approaches for fine-tuning vision-language-action (VLA) models in robotics. Many recent works fine-tune VLAs…

Robotics · Computer Science · 2026-03-31 · Andrew Choi, Xinjie Wang, Zhizhong Su, Wei Xu

Deep Learning models are incredibly data-hungry and require very large labeled datasets for supervised learning. As a consequence, these models often suffer from overfitting, limiting their ability to generalize to real-world examples.…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-16 · Sahiti Yerramilli, Jayant Sravan Tamarapalli, Tanmay Girish Kulkarni, Jonathan Francis, Eric Nyberg

In this paper, we address a key scientific problem in machine learning: Given a training set for an image classification task, can we train a generative model on this dataset to enhance the classification performance? (i.e., closed-set…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-14 · Haowen Wang, Guowei Zhang, Xiang Zhang, Zeyuan Chen, Haiyang Xu, Dou Hoon Kwark, Zhuowen Tu

Recent advances in robot learning have shown promise in enabling robots to perform a variety of manipulation tasks and generalize to novel scenarios. One of the key contributing factors to this progress is the scale of robot data used to…
