Related papers: Semantically Controllable Augmentations for Genera…

GenAug: Retargeting behaviors to unseen situations via Generative Augmentation

Robot learning methods have the potential for widespread generalization across tasks, environments, and objects. However, these methods require large diverse datasets that are expensive to collect in real-world robotics settings. For robot…

Robotics · Computer Science 2023-02-24 Zoey Chen, Sho Kiami, Abhishek Gupta, Vikash Kumar

Stronger Generalization Guarantees for Robot Learning by Combining Generative Models and Real-World Data

We are motivated by the problem of learning policies for robotic systems with rich sensory inputs (e.g., vision) in a manner that allows us to guarantee generalization to environments unseen during training. We provide a framework for…

Robotics · Computer Science 2022-07-25 Abhinav Agarwal, Sushant Veer, Allen Z. Ren, Anirudha Majumdar

Generalizable Task Planning through Representation Pretraining

The ability to plan for multi-step manipulation tasks in unseen situations is crucial for future home robots. But collecting sufficient experience data for end-to-end learning is often infeasible in the real world, as deploying robots in…

Robotics · Computer Science 2022-05-18 Chen Wang, Danfei Xu, Li Fei-Fei

From Seeing to Simulating: Generative High-Fidelity Simulation with Digital Cousins for Generalizable Robot Learning and Evaluation

Learning robust robot policies in real-world environments requires diverse data augmentation, yet scaling real-world data collection is costly due to the need for acquiring physical assets and reconfiguring environments. Therefore,…

Robotics · Computer Science 2026-04-20 Jasper Lu, Zhenhao Shen, Yuanfei Wang, Shugao Liu, Shengqiang Xu, Shawn Xie, Jingkai Xu, Feng Jiang, Jade Yang, Chen Xie, Ruihai Wu

Gen2Sim: Scaling up Robot Learning in Simulation with Generative Models

Generalist robot manipulators need to learn a wide variety of manipulation skills across diverse environments. Current robot training pipelines rely on humans to provide kinesthetic demonstrations or to program simulation environments and…

Robotics · Computer Science 2023-10-30 Pushkal Katara, Zhou Xian, Katerina Fragkiadaki

Scalable Policy Evaluation with Video World Models

Training generalist policies for robotic manipulation has shown great promise, as they enable language-conditioned, multi-task behaviors across diverse scenarios. However, evaluating these policies remains difficult because real-world…

Robotics · Computer Science 2025-12-05 Wei-Cheng Tseng, Jinwei Gu, Qinsheng Zhang, Hanzi Mao, Ming-Yu Liu, Florian Shkurti, Lin Yen-Chen

ASMR: Augmenting Life Scenario using Large Generative Models for Robotic Action Reflection

When designing robots to assist in everyday human activities, it is crucial to enhance user requests with visual cues from their surroundings for improved intent understanding. This process is defined as a multimodal classification task.…

Computation and Language · Computer Science 2025-06-18 Shang-Chi Tsai, Seiya Kawano, Angel Garcia Contreras, Koichiro Yoshino, Yun-Nung Chen

IGen: Scalable Data Generation for Robot Learning from Open-World Images

The rise of generalist robotic policies has created an exponential demand for large-scale training data. However, on-robot data collection is labor-intensive and often limited to specific environments. In contrast, open-world images capture…

Robotics · Computer Science 2026-04-16 Chenghao Gu, Haolan Kang, Junchao Lin, Jinghe Wang, Duo Wu, Shuzhao Xie, Fanding Huang, Junchen Ge, Ziyang Gong, Letian Li, Hongying Zheng, Changwei Lv, Zhi Wang

Ctrl-World: A Controllable Generative World Model for Robot Manipulation

Generalist robot policies can now perform a wide range of manipulation skills, but evaluating and improving their ability with unfamiliar objects and instructions remains a significant challenge. Rigorous evaluation requires a large number…

Robotics · Computer Science 2026-03-03 Yanjiang Guo, Lucy Xiaoyang Shi, Jianyu Chen, Chelsea Finn

Semi-Supervised and Task-Driven Data Augmentation

Supervised deep learning methods for segmentation require large amounts of labelled training data, without which they are prone to overfitting, not generalizing well to unseen images. In practice, obtaining a large number of annotations…

Computer Vision and Pattern Recognition · Computer Science 2019-03-01 Krishna Chaitanya, Neerav Karani, Christian Baumgartner, Olivio Donati, Anton Becker, Ender Konukoglu

Dreamitate: Real-World Visuomotor Policy Learning via Video Generation

A key challenge in manipulation is learning a policy that can robustly generalize to diverse visual environments. A promising mechanism for learning robust policies is to leverage video generative models, which are pretrained on large-scale…

Robotics · Computer Science 2024-06-25 Junbang Liang, Ruoshi Liu, Ege Ozguroglu, Sruthi Sudhakar, Achal Dave, Pavel Tokmakov, Shuran Song, Carl Vondrick

Enhanced Generative Data Augmentation for Semantic Segmentation via Stronger Guidance

Data augmentation is crucial for pixel-wise annotation tasks like semantic segmentation, where labeling requires significant effort and intensive labor. Traditional methods, involving simple transformations such as rotations and flips,…

Computer Vision and Pattern Recognition · Computer Science 2025-09-05 Quang-Huy Che, Duc-Tri Le, Bich-Nga Pham, Duc-Khai Lam, Vinh-Tiep Nguyen

Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation

Generative pre-trained models have demonstrated remarkable effectiveness in language and vision domains by learning useful representations. In this paper, we extend the scope of this effectiveness by showing that visual robot manipulation…

Robotics · Computer Science 2023-12-22 Hongtao Wu, Ya Jing, Chilam Cheang, Guangzeng Chen, Jiafeng Xu, Xinghang Li, Minghuan Liu, Hang Li, Tao Kong

GenSim: Generating Robotic Simulation Tasks via Large Language Models

Collecting large amounts of real-world interaction data to train general robotic policies is often prohibitively expensive, thus motivating the use of simulation data. However, existing methods for data generation have generally focused on…

Machine Learning · Computer Science 2024-01-23 Lirui Wang, Yiyang Ling, Zhecheng Yuan, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang

Scaling Single Human Demonstrations for Imitation Learning using Generative Foundational Models

Imitation learning is a popular paradigm to teach robots new tasks, but collecting robot demonstrations through teleoperation or kinesthetic teaching is tedious and time-consuming. In contrast, directly demonstrating a task using our human…

Robotics · Computer Science 2026-02-16 Nick Heppert, Minh Quang Nguyen, Abhinav Valada

Data Augmentation for Manipulation

The success of deep learning depends heavily on the availability of large datasets, but in robotic manipulation there are many learning problems for which such datasets do not exist. Collecting these datasets is time-consuming and…

Robotics · Computer Science 2022-07-21 Peter Mitrano, Dmitry Berenson

Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds

The strong performance of large vision-language models (VLMs) trained with reinforcement learning (RL) has motivated similar approaches for fine-tuning vision-language-action (VLA) models in robotics. Many recent works fine-tune VLAs…

Robotics · Computer Science 2026-03-31 Andrew Choi, Xinjie Wang, Zhizhong Su, Wei Xu

Semantic Augmentation in Images using Language

Deep Learning models are incredibly data-hungry and require very large labeled datasets for supervised learning. As a consequence, these models often suffer from overfitting, limiting their ability to generalize to real-world examples.…

Computer Vision and Pattern Recognition · Computer Science 2025-09-16 Sahiti Yerramilli, Jayant Sravan Tamarapalli, Tanmay Girish Kulkarni, Jonathan Francis, Eric Nyberg

Exploring the Equivalence of Closed-Set Generative and Real Data Augmentation in Image Classification

In this paper, we address a key scientific problem in machine learning: Given a training set for an image classification task, can we train a generative model on this dataset to enhance the classification performance? (i.e., closed-set…

Computer Vision and Pattern Recognition · Computer Science 2025-08-14 Haowen Wang, Guowei Zhang, Xiang Zhang, Zeyuan Chen, Haiyang Xu, Dou Hoon Kwark, Zhuowen Tu

Scaling Robot Learning with Semantically Imagined Experience

Recent advances in robot learning have shown promise in enabling robots to perform a variety of manipulation tasks and generalize to novel scenarios. One of the key contributing factors to this progress is the scale of robot data used to…

Robotics · Computer Science 2023-02-23 Tianhe Yu, Ted Xiao, Austin Stone, Jonathan Tompson, Anthony Brohan, Su Wang, Jaspiar Singh, Clayton Tan, Dee M, Jodilyn Peralta, Brian Ichter, Karol Hausman, Fei Xia