Related papers: Optimizing Active Perception for Learning Simultan…

A General One-Shot Multimodal Active Perception Framework for Robotic Manipulation: Learning to Predict Optimal Viewpoint

Active perception in vision-based robotic manipulation aims to move the camera toward more informative observation viewpoints, thereby providing high-quality perceptual inputs for downstream tasks. Most existing active perception methods…

Robotics · Computer Science 2026-01-21 Deyun Qin, Zezhi Liu, Hanqian Luo, Xiao Liang, Yongchun Fang

Kinematics-Aware Diffusion Policy with Consistent 3D Observation and Action Space for Whole-Arm Robotic Manipulation

Whole-body control of robotic manipulators with awareness of full-arm kinematics is crucial for many manipulation scenarios involving body collision avoidance or body-object interactions, which makes it insufficient to consider only the…

Robotics · Computer Science 2025-12-22 Kangchen Lv, Mingrui Yu, Yongyi Jia, Chenyu Zhang, Xiang Li

Diffusion-Based Imaginative Coordination for Bimanual Manipulation

Bimanual manipulation is crucial in robotics, enabling complex tasks in industrial automation and household services. However, it poses significant challenges due to the high-dimensional action space and intricate coordination requirements.…

Robotics · Computer Science 2025-07-16 Huilin Xu, Jian Ding, Jiakun Xu, Ruixiang Wang, Jun Chen, Jinjie Mai, Yanwei Fu, Bernard Ghanem, Feng Xu, Mohamed Elhoseiny

Active Perception and Representation for Robotic Manipulation

The vast majority of visual animals actively control their eyes, heads, and/or bodies to direct their gaze toward different parts of their environment. In contrast, recent applications of reinforcement learning in robotic manipulation…

Computer Vision and Pattern Recognition · Computer Science 2020-03-17 Youssef Zaky, Gaurav Paruthi, Bryan Tripp, James Bergstra

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot's visuomotor policy as a conditional denoising diffusion process. We benchmark Diffusion Policy across 12 different tasks from 4…

Robotics · Computer Science 2024-03-15 Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, Shuran Song

Learning Diffusion Policies from Demonstrations For Compliant Contact-rich Manipulation

Robots hold great promise for performing repetitive or hazardous tasks, but achieving human-like dexterity, especially in contact-rich and dynamic environments, remains challenging. Rigid robots, which rely on position or velocity control,…

Robotics · Computer Science 2024-10-28 Malek Aburub, Cristian C. Beltran-Hernandez, Tatsuya Kamijo, Masashi Hamaya

Learning to View: Decision Transformers for Active Object Detection

Active perception describes a broad class of techniques that couple planning and perception systems to move the robot in a way to give the robot more information about the environment. In most robotic systems, perception is typically…

Robotics · Computer Science 2023-01-24 Wenhao Ding, Nathalie Majcherczyk, Mohit Deshpande, Xuewei Qi, Ding Zhao, Rajasimman Madhivanan, Arnie Sen

Visual-Policy Learning through Multi-Camera View to Single-Camera View Knowledge Distillation for Robot Manipulation Tasks

The use of multi-camera views simultaneously has been shown to improve the generalization capabilities and performance of visual policies. However, the hardware cost and design constraints in real-world scenarios can potentially make it…

Robotics · Computer Science 2023-12-05 Cihan Acar, Kuluhan Binici, Alp Tekirdağ, Yan Wu

Learning Coordinated Bimanual Manipulation Policies using State Diffusion and Inverse Dynamics Models

When performing tasks like laundry, humans naturally coordinate both hands to manipulate objects and anticipate how their actions will change the state of the clothes. However, achieving such coordination in robotics remains challenging due…

Robotics · Computer Science 2025-04-01 Haonan Chen, Jiaming Xu, Lily Sheng, Tianchen Ji, Shuijing Liu, Yunzhu Li, Katherine Driggs-Campbell

CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion

Diffusion Policy (DP) enables robots to learn complex behaviors by imitating expert demonstrations through action diffusion. However, in practical applications, hardware limitations often degrade data quality, while real-time constraints…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 Jiahua Ma, Yiran Qin, Yixiong Li, Xuanqi Liao, Yulan Guo, Ruimao Zhang

Inference-stage Adaptation-projection Strategy Adapts Diffusion Policy to Cross-manipulators Scenarios

Diffusion policies are powerful visuomotor models for robotic manipulation, yet they often fail to generalize to manipulators or end-effectors unseen during training and struggle to accommodate new task requirements at inference time.…

Robotics · Computer Science 2025-09-16 Xiangtong Yao, Yirui Zhou, Yuan Meng, Yanwen Liu, Liangyu Dong, Zitao Zhang, Zhenshan Bing, Kai Huang, Fuchun Sun, Alois Knoll

Diffusion Stabilizer Policy for Automated Surgical Robot Manipulations

Intelligent surgical robots have the potential to revolutionize clinical practice by enabling more precise and automated surgical procedures. However, the automation of such robots for surgical tasks remains under-explored compared to recent…

Robotics · Computer Science 2026-03-10 Chonlam Ho, Jianshu Hu, Lei Song, Hesheng Wang, Qi Dou, Yutong Ban

Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training

Learning a generalist embodied agent capable of completing multiple tasks poses challenges, primarily stemming from the scarcity of action-labeled robotic datasets. In contrast, a vast amount of human videos exist, capturing intricate tasks…

Machine Learning · Computer Science 2024-10-10 Haoran He, Chenjia Bai, Ling Pan, Weinan Zhang, Bin Zhao, Xuelong Li

Interactive Perception for Deformable Object Manipulation

Interactive perception enables robots to manipulate the environment and objects to bring them into states that benefit the perception process. Deformable objects pose challenges to this due to significant manipulation difficulty and…

Robotics · Computer Science 2024-10-27 Zehang Weng, Peng Zhou, Hang Yin, Alexander Kravberg, Anastasiia Varava, David Navarro-Alarcon, Danica Kragic

Viewpoint-Agnostic Manipulation Policies with Strategic Vantage Selection

Since vision-based manipulation policies are typically trained from data gathered from a single viewpoint, their performance drops when the view changes during deployment. Naively aggregating demonstrations from numerous random views is not…

Robotics · Computer Science 2025-10-07 Sreevishakh Vasudevan, Som Sagar, Ransalu Senanayake

Seeing All the Angles: Learning Multiview Manipulation Policies for Contact-Rich Tasks from Demonstrations

Learned visuomotor policies have shown considerable success as an alternative to traditional, hand-crafted frameworks for robotic manipulation. Surprisingly, an extension of these methods to the multiview domain is relatively unexplored. A…

Robotics · Computer Science 2022-07-11 Trevor Ablett, Yifan Zhai, Jonathan Kelly

ForeDiffusion: Foresight-Conditioned Diffusion Policy via Future View Construction for Robot Manipulation

Diffusion strategies have advanced visual motor control by progressively denoising high-dimensional action sequences, providing a promising method for robot manipulation. However, as task complexity increases, the success rate of existing…

Robotics · Computer Science 2026-01-21 Weize Xie, Yi Ding, Ying He, Leilei Wang, Binwen Bai, Zheyi Zhao, Chenyang Wang, F. Richard Yu

Sample from What You See: Visuomotor Policy Learning via Diffusion Bridge with Observation-Embedded Stochastic Differential Equation

Imitation learning with diffusion models has advanced robotic control by capturing the multi-modal action distributions. However, existing methods typically treat observations only as high-level conditions to the denoising network, rather…

Artificial Intelligence · Computer Science 2026-02-05 Zhaoyang Liu, Mokai Pan, Zhongyi Wang, Kaizhen Zhu, Haotao Lu, Haipeng Zhang, Jingya Wang, Ye Shi

Learning to Act Robustly with View-Invariant Latent Actions

Vision-based robotic policies often struggle with even minor viewpoint changes, underscoring the need for view-invariant visual representations. This challenge becomes more pronounced in real-world settings, where viewpoint variability is…

Robotics · Computer Science 2026-01-07 Youngjoon Jeong, Junha Chun, Taesup Kim

A Flexible Field-Based Policy Learning Framework for Diverse Robotic Systems and Sensors

We present a cross robot visuomotor learning framework that integrates diffusion policy based control with 3D semantic scene representations from D3Fields to enable category level generalization in manipulation. Its modular design supports…

Robotics · Computer Science 2025-12-23 Jose Gustavo Buenaventura Carreon, Floris Erich, Roman Mykhailyshyn, Tomohiro Motoda, Ryo Hanai, Yukiyasu Domae