Related papers: DALL-E-Bot: Introducing Web-Scale Diffusion Models…

Illiterate DALL-E Learns to Compose

Although DALL-E has shown an impressive ability of composition-based systematic generalization in image generation, it requires the dataset of text-image pairs and the compositionality is provided by the text. In contrast, object-centric…

Computer Vision and Pattern Recognition · Computer Science 2022-03-16 Gautam Singh, Fei Deng, Sungjin Ahn

DARE: Diffusion Policy for Autonomous Robot Exploration

Autonomous robot exploration requires a robot to efficiently explore and map unknown environments. Compared to conventional methods that can only optimize paths based on the current robot belief, learning-based methods show the potential to…

Robotics · Computer Science 2024-10-23 Yuhong Cao, Jeric Lew, Jingsong Liang, Jin Cheng, Guillaume Sartoretti

Scaling Robot Learning with Semantically Imagined Experience

Recent advances in robot learning have shown promise in enabling robots to perform a variety of manipulation tasks and generalize to novel scenarios. One of the key contributing factors to this progress is the scale of robot data used to…

Robotics · Computer Science 2023-02-23 Tianhe Yu, Ted Xiao, Austin Stone, Jonathan Tompson, Anthony Brohan, Su Wang, Jaspiar Singh, Clayton Tan, Dee M, Jodilyn Peralta, Brian Ichter, Karol Hausman, Fei Xia

Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models

If generalist robots are to operate in truly unstructured environments, they need to be able to recognize and reason about novel objects and scenarios. Such objects and scenarios might not be present in the robot's own training data. We…

Robotics · Computer Science 2023-10-17 Kevin Black, Mitsuhiko Nakamoto, Pranav Atreya, Homer Walke, Chelsea Finn, Aviral Kumar, Sergey Levine

Implementing and Experimenting with Diffusion Models for Text-to-Image Generation

Taking advantage of the many recent advances in deep learning, text-to-image generative models currently have the merit of attracting the general public attention. Two of these models, DALL-E 2 and Imagen, have demonstrated that highly…

Computer Vision and Pattern Recognition · Computer Science 2022-09-23 Robin Zbinden

Diffusion Models for Robotic Manipulation: A Survey

Diffusion generative models have demonstrated remarkable success in visual domains such as image and video generation. They have also recently emerged as a promising approach in robotics, especially in robot manipulations. Diffusion models…

Robotics · Computer Science 2025-07-15 Rosa Wolf, Yitian Shi, Sheng Liu, Rania Rayyes

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models

Recently, DALL-E, a multimodal transformer language model, and its variants, including diffusion models, have shown high-quality text-to-image generation capabilities. However, despite the realistic image generation results, there has not…

Computer Vision and Pattern Recognition · Computer Science 2023-09-01 Jaemin Cho, Abhay Zala, Mohit Bansal

ReorientDiff: Diffusion Model based Reorientation for Object Manipulation

The ability to manipulate objects in desired configurations is a fundamental requirement for robots to complete various practical applications. While certain goals can be achieved by picking and placing the objects of interest directly,…

Robotics · Computer Science 2023-09-18 Utkarsh A. Mishra, Yongxin Chen

Directed Diffusion: Direct Control of Object Placement through Attention Guidance

Text-guided diffusion models such as DALLE-2, Imagen, eDiff-I, and Stable Diffusion are able to generate an effectively endless variety of images given only a short text prompt describing the desired image content. In many cases the images…

Computer Vision and Pattern Recognition · Computer Science 2023-09-27 Wan-Duo Kurt Ma, J. P. Lewis, Avisek Lahiri, Thomas Leung, W. Bastiaan Kleijn

CoBL-Diffusion: Diffusion-Based Conditional Robot Planning in Dynamic Environments Using Control Barrier and Lyapunov Functions

Equipping autonomous robots with the ability to navigate safely and efficiently around humans is a crucial step toward achieving trusted robot autonomy. However, generating robot plans while ensuring safety in dynamic multi-agent…

Robotics · Computer Science 2024-11-14 Kazuki Mizuta, Karen Leung

StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects

Robots operating in human environments must be able to rearrange objects into semantically-meaningful configurations, even if these objects are previously unseen. In this work, we focus on the problem of building physically-valid structures…

Robotics · Computer Science 2023-04-26 Weiyu Liu, Yilun Du, Tucker Hermans, Sonia Chernova, Chris Paxton

DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection

We propose a new paradigm to automatically generate training data with accurate labels at scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation…

Computer Vision and Pattern Recognition · Computer Science 2022-12-23 Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

Diffusing in Someone Else's Shoes: Robotic Perspective Taking with Diffusion

Humanoid robots can benefit from their similarity to the human shape by learning from humans. When humans teach other humans how to perform actions, they often demonstrate the actions, and the learning human imitates the demonstration to…

Robotics · Computer Science 2024-10-07 Josua Spisak, Matthias Kerzel, Stefan Wermter

Unleashing the Potential of Diffusion Models for End-to-End Autonomous Driving

Diffusion models have become a popular choice for decision-making tasks in robotics, and more recently, are also being considered for solving autonomous driving tasks. However, their applications and evaluations in autonomous driving remain…

Robotics · Computer Science 2026-05-19 Yinan Zheng, Tianyi Tan, Bin Huang, Enguang Liu, Ruiming Liang, Jianlin Zhang, Jianwei Cui, Guang Chen, Kun Ma, Hangjun Ye, Long Chen, Ya-Qin Zhang, Xianyuan Zhan, Jingjing Liu

Pixel Motion Diffusion is What We Need for Robot Control

We present DAWN (Diffusion is All We Need for robot control), a unified diffusion-based framework for language-conditioned robotic manipulation that bridges high-level motion intent and low-level robot action via structured pixel motion…

Robotics · Computer Science 2026-04-03 E-Ro Nguyen, Yichi Zhang, Kanchana Ranasinghe, Xiang Li, Michael S. Ryoo

Joint Localization and Planning using Diffusion

Diffusion models have been successfully applied to robotics problems such as manipulation and vehicle path planning. In this work, we explore their application to end-to-end navigation -- including both perception and planning -- by…

Robotics · Computer Science 2024-09-27 L. Lao Beyer, S. Karaman

DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models

Nature evolves creatures with a high complexity of morphological and behavioral intelligence, meanwhile computational methods lag in approaching that diversity and efficacy. Co-optimization of artificial creatures' morphology and control in…

Robotics · Computer Science 2023-11-29 Tsun-Hsuan Wang, Juntian Zheng, Pingchuan Ma, Yilun Du, Byungchul Kim, Andrew Spielberg, Joshua Tenenbaum, Chuang Gan, Daniela Rus

dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model

Evaluating robotics policies across thousands of environments and thousands of tasks is infeasible with existing approaches. This motivates the need for a new methodology for scalable robotics policy evaluation. In this paper, we propose…

Robotics · Computer Science 2026-04-27 Yaxuan Li, Zhongyi Zhou, Yefei Chen, Yaokai Xue, Yichen Zhu

Diffusion Based Robust LiDAR Place Recognition

Mobile robots on construction sites require accurate pose estimation to perform autonomous surveying and inspection missions. Localization in construction sites is a particularly challenging problem due to the presence of repetitive…

Robotics · Computer Science 2025-04-18 Benjamin Krummenacher, Jonas Frey, Turcan Tuna, Olga Vysotska, Marco Hutter

To the Noise and Back: Diffusion for Shared Autonomy

Shared autonomy is an operational concept in which a user and an autonomous agent collaboratively control a robotic system. It provides a number of advantages over the extremes of full-teleoperation and full-autonomy in many settings.…

Robotics · Computer Science 2025-08-28 Takuma Yoneda, Luzhe Sun, Ge Yang, Bradly Stadie, Matthew Walter