Related papers: 3D Diffuser Actor: Policy Diffusion with 3D Scene …

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot's visuomotor policy as a conditional denoising diffusion process. We benchmark Diffusion Policy across 12 different tasks from 4…

Robotics · Computer Science 2024-03-15 Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, Shuran Song

3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations

Imitation learning provides an efficient way to teach robots dexterous skills; however, learning complex skills robustly and generalizably usually consumes large amounts of human demonstrations. To tackle this challenging problem, we…

Robotics · Computer Science 2024-09-30 Yanjie Ze, Gu Zhang, Kangning Zhang, Chenyuan Hu, Muhan Wang, Huazhe Xu

3D FlowMatch Actor: Unified 3D Policy for Single- and Dual-Arm Manipulation

We present 3D FlowMatch Actor (3DFA), a 3D policy architecture for robot manipulation that combines flow matching for trajectory prediction with 3D pretrained visual scene representations for learning from demonstration. 3DFA leverages 3D…

Robotics · Computer Science 2025-08-21 Nikolaos Gkanatsios, Jiahe Xu, Matthew Bronars, Arsalan Mousavian, Tsung-Wei Ke, Katerina Fragkiadaki

Diffusion-based Generation, Optimization, and Planning in 3D Scenes

We introduce SceneDiffuser, a conditional generative model for 3D scene understanding. SceneDiffuser provides a unified model for solving scene-conditioned generation, optimization, and planning. In contrast to prior works, SceneDiffuser is…

Computer Vision and Pattern Recognition · Computer Science 2023-01-18 Siyuan Huang, Zan Wang, Puhao Li, Baoxiong Jia, Tengyu Liu, Yixin Zhu, Wei Liang, Song-Chun Zhu

D3P: Dynamic Denoising Diffusion Policy via Reinforcement Learning

Diffusion policies excel at learning complex action distributions for robotic visuomotor tasks, yet their iterative denoising process poses a major bottleneck for real-time deployment. Existing acceleration methods apply a fixed number of…

Robotics · Computer Science 2025-08-12 Shu-Ang Yu, Feng Gao, Yi Wu, Chao Yu, Yu Wang

CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion

Diffusion Policy (DP) enables robots to learn complex behaviors by imitating expert demonstrations through action diffusion. However, in practical applications, hardware limitations often degrade data quality, while real-time constraints…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 Jiahua Ma, Yiran Qin, Yixiong Li, Xuanqi Liao, Yulan Guo, Ruimao Zhang

Optimizing Active Perception for Learning Simultaneous Viewpoint Selection and Manipulation with Diffusion Policy

Robotic manipulation tasks often rely on static cameras for perception, which can limit flexibility, particularly in scenarios like robotic surgery and cluttered environments where mounting static cameras is impractical. Ideally, robots…

Robotics · Computer Science 2025-09-18 Xiatao Sun, Francis Fan, Yinxing Chen, Daniel Rakita

Kinematics-Aware Diffusion Policy with Consistent 3D Observation and Action Space for Whole-Arm Robotic Manipulation

Whole-body control of robotic manipulators with awareness of full-arm kinematics is crucial for many manipulation scenarios involving body collision avoidance or body-object interactions, which makes it insufficient to consider only the…

Robotics · Computer Science 2025-12-22 Kangchen Lv, Mingrui Yu, Yongyi Jia, Chenyu Zhang, Xiang Li

Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition

Diffusion-based models for robotic control, including vision-language-action (VLA) and vision-action (VA) policies, have demonstrated significant capabilities. Yet their advancement is constrained by the high cost of acquiring large-scale…

Robotics · Computer Science 2026-03-11 Jiahang Cao, Yize Huang, Hanzhong Guo, Rui Zhang, Mu Nan, Weijian Mai, Jiaxu Wang, Hao Cheng, Jingkai Sun, Gang Han, Wen Zhao, Qiang Zhang, Yijie Guo, Qihao Zheng, Chunfeng Song, Xiao Li, Ping Luo, Andrew F. Luo

3D Flow Diffusion Policy: Visuomotor Policy Learning via Generating Flow in 3D Space

Learning robust visuomotor policies that generalize across diverse objects and interaction dynamics remains a central challenge in robotic manipulation. Most existing approaches rely on direct observation-to-action mappings or compress…

Robotics · Computer Science 2025-09-24 Sangjun Noh, Dongwoo Nam, Kangmin Kim, Geonhyup Lee, Yeonguk Yu, Raeyoung Kang, Kyoobin Lee

Diffusion Stabilizer Policy for Automated Surgical Robot Manipulations

Intelligent surgical robots have the potential to revolutionize clinical practice by enabling more precise and automated surgical procedures. However, the automation of such robots for surgical tasks remains under-explored compared to recent…

Robotics · Computer Science 2026-03-10 Chonlam Ho, Jianshu Hu, Lei Song, Hesheng Wang, Qi Dou, Yutong Ban

EL3DD: Extended Latent 3D Diffusion for Language Conditioned Multitask Manipulation

Acting in human environments is a crucial capability for general-purpose robots, necessitating a robust understanding of natural language and its application to physical tasks. This paper seeks to harness the capabilities of diffusion…

Robotics · Computer Science 2026-04-28 Jonas Bode, Raphael Memmesheimer, Sven Behnke

Streaming Diffusion Policy: Fast Policy Synthesis with Variable Noise Diffusion Models

Diffusion models have seen rapid adoption in robotic imitation learning, enabling autonomous execution of complex dexterous tasks. However, action synthesis is often slow, requiring many steps of iterative denoising, limiting the extent to…

Robotics · Computer Science 2024-10-14 Sigmund H. Høeg, Yilun Du, Olav Egeland

Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches

We present a method that reduces, by an order of magnitude, the time and memory needed to train multi-task vision-language robotic diffusion policies. This improvement arises from a previously underexplored distinction between action…

Robotics · Computer Science 2025-06-06 Yutong Hu, Pinhao Song, Kehan Wen, Renaud Detry

Prediction with Action: Visual Policy Learning via Joint Denoising Process

Diffusion models have demonstrated remarkable capabilities in image generation tasks, including image editing and video creation, reflecting a good understanding of the physical world. On the other hand, diffusion models have also shown…

Robotics · Computer Science 2024-11-28 Yanjiang Guo, Yucheng Hu, Jianke Zhang, Yen-Jen Wang, Xiaoyu Chen, Chaochao Lu, Jianyu Chen

Exploration and Adaptation in Non-Stationary Tasks with Diffusion Policies

This paper investigates the application of Diffusion Policy in non-stationary, vision-based RL settings, specifically targeting environments where task dynamics and objectives evolve over time. Our work is grounded in practical challenges…

Artificial Intelligence · Computer Science 2025-04-02 Gunbir Singh Baveja

Unpacking the Individual Components of Diffusion Policy

Imitation Learning presents a promising approach for learning generalizable and complex robotic skills. The recently proposed Diffusion Policy generates robot action sequences through a conditional denoising diffusion process, achieving…

Machine Learning · Computer Science 2024-12-03 Xiu Yuan

Inference-stage Adaptation-projection Strategy Adapts Diffusion Policy to Cross-manipulators Scenarios

Diffusion policies are powerful visuomotor models for robotic manipulation, yet they often fail to generalize to manipulators or end-effectors unseen during training and struggle to accommodate new task requirements at inference time.…

Robotics · Computer Science 2025-09-16 Xiangtong Yao, Yirui Zhou, Yuan Meng, Yanwen Liu, Liangyu Dong, Zitao Zhang, Zhenshan Bing, Kai Huang, Fuchun Sun, Alois Knoll

CoDiff: Conditional Diffusion Model for Collaborative 3D Object Detection

Collaborative 3D object detection holds significant importance in the field of autonomous driving, as it greatly enhances the perception capabilities of each individual agent by facilitating information exchange among multiple agents.…

Computer Vision and Pattern Recognition · Computer Science 2025-09-05 Zhe Huang, Shuo Wang, Yongcai Wang, Lei Wang

Video2Act: A Dual-System Video Diffusion Policy with Robotic Spatio-Motional Modeling

Robust perception and dynamics modeling are fundamental to real-world robotic policy learning. Recent methods employ video diffusion models (VDMs) to enhance robotic policies, improving their understanding and modeling of the physical…

Robotics · Computer Science 2026-03-25 Yueru Jia, Jiaming Liu, Shengbang Liu, Rui Zhou, Wanhe Yu, Yuyang Yan, Xiaowei Chi, Yandong Guo, Boxin Shi, Shanghang Zhang