Related papers: Motion Before Action: Diffusing Object Motion as M…

Motion Planning Diffusion: Learning and Planning of Robot Motions with Diffusion Models

Learning priors on trajectory distributions can help accelerate robot motion planning optimization. Given previously successful plans, learning trajectory generative models as priors for a new planning problem is highly desirable. Prior…

Robotics · Computer Science 2024-03-27 Joao Carvalho, An T. Le, Mark Baierl, Dorothea Koert, Jan Peters

Learning Coordinated Bimanual Manipulation Policies using State Diffusion and Inverse Dynamics Models

When performing tasks like laundry, humans naturally coordinate both hands to manipulate objects and anticipate how their actions will change the state of the clothes. However, achieving such coordination in robotics remains challenging due…

Robotics · Computer Science 2025-04-01 Haonan Chen, Jiaming Xu, Lily Sheng, Tianchen Ji, Shuijing Liu, Yunzhu Li, Katherine Driggs-Campbell

E-Motion: Future Motion Simulation via Event Sequence Diffusion

Forecasting a typical object's future motion is a critical task for interpreting and interacting with dynamic environments in computer vision. Event-based sensors, which could capture changes in the scene with exceptional temporal…

Computer Vision and Pattern Recognition · Computer Science 2024-10-14 Song Wu, Zhiyu Zhu, Junhui Hou, Guangming Shi, Jinjian Wu

CoVAR: Co-generation of Video and Action for Robotic Manipulation via Multi-Modal Diffusion

We present a method to generate video-action pairs that follow text instructions, starting from an initial image observation and the robot's joint states. Our approach automatically provides action labels for video diffusion models,…

Computer Vision and Pattern Recognition · Computer Science 2025-12-19 Liudi Yang, Yang Bai, George Eskandar, Fengyi Shen, Mohammad Altillawi, Dong Chen, Ziyuan Liu, Abhinav Valada

Robotic Imitation of Human Actions

Imitation can allow us to quickly gain an understanding of a new task. Through a demonstration, we can gain direct knowledge about which actions need to be performed and which goals they have. In this paper, we introduce a new approach to…

Robotics · Computer Science 2024-06-04 Josua Spisak, Matthias Kerzel, Stefan Wermter

Diffusion-Based Imaginative Coordination for Bimanual Manipulation

Bimanual manipulation is crucial in robotics, enabling complex tasks in industrial automation and household services. However, it poses significant challenges due to the high-dimensional action space and intricate coordination requirements.…

Robotics · Computer Science 2025-07-16 Huilin Xu, Jian Ding, Jiakun Xu, Ruixiang Wang, Jun Chen, Jinjie Mai, Yanwei Fu, Bernard Ghanem, Feng Xu, Mohamed Elhoseiny

From Imagined Futures to Executable Actions: Mixture of Latent Actions for Robot Manipulation

Video generation models offer a promising imagination mechanism for robot manipulation by predicting long-horizon future observations, but effectively exploiting these imagined futures for action execution remains challenging. Existing…

Robotics · Computer Science 2026-05-13 Yajie Li, Bozhou Zhang, Chun Gu, Zipei Ma, Jiahui Zhang, Jiankang Deng, Xiatian Zhu, Li Zhang

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot's visuomotor policy as a conditional denoising diffusion process. We benchmark Diffusion Policy across 12 different tasks from 4…

Robotics · Computer Science 2024-03-15 Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, Shuran Song

Robotic VLA Benefits from Joint Learning with Motion Image Diffusion

Vision-Language-Action (VLA) models have achieved remarkable progress in robotic manipulation by mapping multimodal observations and instructions directly to actions. However, they typically mimic expert trajectories without predictive…

Robotics · Computer Science 2025-12-23 Yu Fang, Kanchana Ranasinghe, Le Xue, Honglu Zhou, Juntao Tan, Ran Xu, Shelby Heinecke, Caiming Xiong, Silvio Savarese, Daniel Szafir, Mingyu Ding, Michael S. Ryoo, Juan Carlos Niebles

Motion Planning Diffusion: Learning and Adapting Robot Motion Planning with Diffusion Models

The performance of optimization-based robot motion planning algorithms is highly dependent on the initial solutions, commonly obtained by running a sampling-based planner to obtain a collision-free path. However, these methods can be slow…

Robotics · Computer Science 2025-08-15 J. Carvalho, A. Le, P. Kicki, D. Koert, J. Peters

DemoDiffusion: One-Shot Human Imitation using pre-trained Diffusion Policy

We propose DemoDiffusion, a simple method for enabling robots to perform manipulation tasks by imitating a single human demonstration, without requiring task-specific training or paired human-robot data. Our approach is based on two…

Robotics · Computer Science 2026-03-10 Sungjae Park, Homanga Bharadhwaj, Shubham Tulsiani

Controllable Motion Generation via Diffusion Modal Coupling

Diffusion models have recently gained significant attention in robotics due to their ability to generate multi-modal distributions of system states and behaviors. However, a key challenge remains: ensuring precise control over the generated…

Robotics · Computer Science 2025-10-01 Luobin Wang, Hongzhan Yu, Chenning Yu, Sicun Gao, Henrik Christensen

Causal World Modeling for Robot Control

This work highlights that video world modeling, alongside vision-language pre-training, establishes a fresh and independent foundation for robot learning. Intuitively, video world models provide the ability to imagine the near future by…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Lin Li, Qihang Zhang, Yiming Luo, Shuai Yang, Ruilin Wang, Fei Han, Mingrui Yu, Zelin Gao, Nan Xue, Xing Zhu, Yujun Shen, Yinghao Xu

MDMP: Multi-modal Diffusion for supervised Motion Predictions with uncertainty

This paper introduces a Multi-modal Diffusion model for Motion Prediction (MDMP) that integrates and synchronizes skeletal data and textual descriptions of actions to generate refined long-term motion predictions with quantifiable…

Computer Vision and Pattern Recognition · Computer Science 2025-06-03 Leo Bringer, Joey Wilson, Kira Barton, Maani Ghaffari

Multi-Robot Motion Planning with Diffusion Models

Diffusion models have recently been successfully applied to a wide range of robotics applications for learning complex multi-modal behaviors from data. However, prior works have mostly been confined to single-robot and small-scale…

Robotics · Computer Science 2025-05-08 Yorai Shaoul, Itamar Mishani, Shivam Vats, Jiaoyang Li, Maxim Likhachev

Learning Human Motion with Temporally Conditional Mamba

Learning human motion based on a time-dependent input signal presents a challenging yet impactful task with various applications. The goal of this task is to generate or estimate human movement that consistently reflects the temporal…

Computer Vision and Pattern Recognition · Computer Science 2025-10-15 Quang Nguyen, Tri Le, Baoru Huang, Minh Nhat Vu, Ngan Le, Thieu Vo, Anh Nguyen

Object Motion Guided Human Motion Synthesis

Modeling human behaviors in contextual environments has a wide range of applications in character animation, embodied AI, VR/AR, and robotics. In real-world scenarios, humans frequently interact with the environment and manipulate various…

Computer Vision and Pattern Recognition · Computer Science 2023-09-29 Jiaman Li, Jiajun Wu, C. Karen Liu

SPOT: SE(3) Pose Trajectory Diffusion for Object-Centric Manipulation

We introduce SPOT, an object-centric imitation learning framework. The key idea is to capture each task by an object-centric representation, specifically the SE(3) object pose trajectory relative to the target. This approach decouples…

Robotics · Computer Science 2025-05-15 Cheng-Chun Hsu, Bowen Wen, Jie Xu, Yashraj Narang, Xiaolong Wang, Yuke Zhu, Joydeep Biswas, Stan Birchfield

Exploring Conditions for Diffusion models in Robotic Control

While pre-trained visual representations have significantly advanced imitation learning, they are often task-agnostic as they remain frozen during policy learning. In this work, we explore leveraging pre-trained text-to-image diffusion…

Computer Vision and Pattern Recognition · Computer Science 2026-04-09 Heeseong Shin, Byeongho Heo, Dongyoon Han, Seungryong Kim, Taekyung Kim

RoLD: Robot Latent Diffusion for Multi-task Policy Modeling

Modeling generalized robot control policies poses ongoing challenges for language-guided robot manipulation tasks. Existing methods often struggle to efficiently utilize cross-dataset resources or rely on resource-intensive vision-language…

Robotics · Computer Science 2024-11-05 Wenhui Tan, Bei Liu, Junbo Zhang, Ruihua Song, Jianlong Fu