Related papers: 3D Diffuser Actor: Policy Diffusion with 3D Scene …

200 papers

This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot's visuomotor policy as a conditional denoising diffusion process. We benchmark Diffusion Policy across 12 different tasks from 4…

Robotics · Computer Science 2024-03-15 Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, Shuran Song

Imitation learning provides an efficient way to teach robots dexterous skills; however, learning complex skills robustly and generalizably usually consumes large amounts of human demonstrations. To tackle this challenging problem, we…

Robotics · Computer Science 2024-09-30 Yanjie Ze, Gu Zhang, Kangning Zhang, Chenyuan Hu, Muhan Wang, Huazhe Xu

We present 3D FlowMatch Actor (3DFA), a 3D policy architecture for robot manipulation that combines flow matching for trajectory prediction with 3D pretrained visual scene representations for learning from demonstration. 3DFA leverages 3D…

We introduce SceneDiffuser, a conditional generative model for 3D scene understanding. SceneDiffuser provides a unified model for solving scene-conditioned generation, optimization, and planning. In contrast to prior works, SceneDiffuser is…

Computer Vision and Pattern Recognition · Computer Science 2023-01-18 Siyuan Huang, Zan Wang, Puhao Li, Baoxiong Jia, Tengyu Liu, Yixin Zhu, Wei Liang, Song-Chun Zhu

Diffusion policies excel at learning complex action distributions for robotic visuomotor tasks, yet their iterative denoising process poses a major bottleneck for real-time deployment. Existing acceleration methods apply a fixed number of…

Robotics · Computer Science 2025-08-12 Shu-Ang Yu, Feng Gao, Yi Wu, Chao Yu, Yu Wang

Diffusion Policy (DP) enables robots to learn complex behaviors by imitating expert demonstrations through action diffusion. However, in practical applications, hardware limitations often degrade data quality, while real-time constraints…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 Jiahua Ma, Yiran Qin, Yixiong Li, Xuanqi Liao, Yulan Guo, Ruimao Zhang

Robotic manipulation tasks often rely on static cameras for perception, which can limit flexibility, particularly in scenarios like robotic surgery and cluttered environments where mounting static cameras is impractical. Ideally, robots…

Robotics · Computer Science 2025-09-18 Xiatao Sun, Francis Fan, Yinxing Chen, Daniel Rakita

Whole-body control of robotic manipulators with awareness of full-arm kinematics is crucial for many manipulation scenarios involving body collision avoidance or body-object interactions, which makes it insufficient to consider only the…

Robotics · Computer Science 2025-12-22 Kangchen Lv, Mingrui Yu, Yongyi Jia, Chenyu Zhang, Xiang Li

Diffusion-based models for robotic control, including vision-language-action (VLA) and vision-action (VA) policies, have demonstrated significant capabilities. Yet their advancement is constrained by the high cost of acquiring large-scale…

Learning robust visuomotor policies that generalize across diverse objects and interaction dynamics remains a central challenge in robotic manipulation. Most existing approaches rely on direct observation-to-action mappings or compress…

Robotics · Computer Science 2025-09-24 Sangjun Noh, Dongwoo Nam, Kangmin Kim, Geonhyup Lee, Yeonguk Yu, Raeyoung Kang, Kyoobin Lee

Intelligent surgical robots have the potential to revolutionize clinical practice by enabling more precise and automated surgical procedures. However, the automation of such robots for surgical tasks remains under-explored compared to recent…

Robotics · Computer Science 2026-03-10 Chonlam Ho, Jianshu Hu, Lei Song, Hesheng Wang, Qi Dou, Yutong Ban

Acting in human environments is a crucial capability for general-purpose robots, necessitating a robust understanding of natural language and its application to physical tasks. This paper seeks to harness the capabilities of diffusion…

Robotics · Computer Science 2026-04-28 Jonas Bode, Raphael Memmesheimer, Sven Behnke

Diffusion models have seen rapid adoption in robotic imitation learning, enabling autonomous execution of complex dexterous tasks. However, action synthesis is often slow, requiring many steps of iterative denoising, limiting the extent to…

Robotics · Computer Science 2024-10-14 Sigmund H. Høeg, Yilun Du, Olav Egeland

We present a method that reduces, by an order of magnitude, the time and memory needed to train multi-task vision-language robotic diffusion policies. This improvement arises from a previously underexplored distinction between action…

Robotics · Computer Science 2025-06-06 Yutong Hu, Pinhao Song, Kehan Wen, Renaud Detry

Diffusion models have demonstrated remarkable capabilities in image generation tasks, including image editing and video creation, reflecting a good understanding of the physical world. On the other hand, diffusion models have also shown…

Robotics · Computer Science 2024-11-28 Yanjiang Guo, Yucheng Hu, Jianke Zhang, Yen-Jen Wang, Xiaoyu Chen, Chaochao Lu, Jianyu Chen

This paper investigates the application of Diffusion Policy in non-stationary, vision-based RL settings, specifically targeting environments where task dynamics and objectives evolve over time. Our work is grounded in practical challenges…

Artificial Intelligence · Computer Science 2025-04-02 Gunbir Singh Baveja

Imitation Learning presents a promising approach for learning generalizable and complex robotic skills. The recently proposed Diffusion Policy generates robot action sequences through a conditional denoising diffusion process, achieving…

Machine Learning · Computer Science 2024-12-03 Xiu Yuan

Diffusion policies are powerful visuomotor models for robotic manipulation, yet they often fail to generalize to manipulators or end-effectors unseen during training and struggle to accommodate new task requirements at inference time.…

Collaborative 3D object detection holds significant importance in the field of autonomous driving, as it greatly enhances the perception capabilities of each individual agent by facilitating information exchange among multiple agents.…

Computer Vision and Pattern Recognition · Computer Science 2025-09-05 Zhe Huang, Shuo Wang, Yongcai Wang, Lei Wang

Robust perception and dynamics modeling are fundamental to real-world robotic policy learning. Recent methods employ video diffusion models (VDMs) to enhance robotic policies, improving their understanding and modeling of the physical…
