Related papers: DNAct: Diffusion Guided Multi-Task 3D Policy Learn…

200 papers

Modeling generalized robot control policies poses ongoing challenges for language-guided robot manipulation tasks. Existing methods often struggle to efficiently utilize cross-dataset resources or rely on resource-intensive vision-language…

Robotics · Computer Science 2024-11-05 Wenhui Tan, Bei Liu, Junbo Zhang, Ruihua Song, Jianlong Fu

This work introduces the Multimodal Diffusion Transformer (MDT), a novel diffusion policy framework that excels at learning versatile behavior from multimodal goal specifications with few language annotations. MDT leverages a…

Robotics · Computer Science 2024-07-09 Moritz Reuss, Ömer Erdinç Yağmurlu, Fabian Wenzel, Rudolf Lioutikov

Despite recent advances in dexterous manipulations, the manipulation of articulated objects and generalization across different categories remain significant challenges. To address these issues, we introduce DART, a novel framework that…

Robotics · Computer Science 2025-09-19 Hao Zhang, Zhen Kan, Weiwei Shang, Yongduan Song

Recent advances in deep learning have shown that learning robust feature representations is critical for the success of many computer vision tasks, including medical image segmentation. In particular, both transformer and…

Computer Vision and Pattern Recognition · Computer Science 2025-02-03 David Li, Anvar Kurmukov, Mikhail Goncharov, Roman Sokolov, Mikhail Belyaev

Previously, non-autoregressive models were widely perceived as being superior in generation efficiency but inferior in generation quality due to the difficulties of modeling multiple target modalities. To enhance the multi-modality modeling…

Computation and Language · Computer Science 2023-11-30 Lihua Qian, Mingxuan Wang, Yang Liu, Hao Zhou

Acting in human environments is a crucial capability for general-purpose robots, necessitating a robust understanding of natural language and its application to physical tasks. This paper seeks to harness the capabilities of diffusion…

Robotics · Computer Science 2026-04-28 Jonas Bode, Raphael Memmesheimer, Sven Behnke

Recently, there has been an increased interest in the practical problem of learning multiple dense scene understanding tasks from partially annotated data, where each training sample is only labeled for a subset of the tasks. The missing of…

Computer Vision and Pattern Recognition · Computer Science 2024-03-25 Hanrong Ye, Dan Xu

Recently, 2D speaking avatars have increasingly participated in everyday scenarios due to the fast development of facial animation techniques. However, most existing works neglect the explicit control of human bodies. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Jiazhi Guan, Quanwei Yang, Kaisiyuan Wang, Hang Zhou, Shengyi He, Zhiliang Xu, Haocheng Feng, Errui Ding, Jingdong Wang, Hongtao Xie, Youjian Zhao, Ziwei Liu

We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models. Given a pair of task-specific example images, such as depth from/to image and scribble from/to image, and a text guidance, our…

Computer Vision and Pattern Recognition · Computer Science 2023-10-20 Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou

Robust perception and dynamics modeling are fundamental to real-world robotic policy learning. Recent methods employ video diffusion models (VDMs) to enhance robotic policies, improving their understanding and modeling of the physical…

Learning domain-adaptive policies that can generalize to unseen transition dynamics remains a fundamental challenge in learning-based control. Substantial progress has been made through domain representation learning to capture…

Machine Learning · Computer Science 2026-03-31 Pengcheng Wang, Qinghang Liu, Haotian Lin, Yiheng Li, Guojian Zhan, Masayoshi Tomizuka, Yixiao Wang

Decision Transformer (DT), a trajectory modelling method, has shown competitive performance compared to traditional offline reinforcement learning (RL) approaches on various classic control tasks. However, it struggles to learn optimal…

Machine Learning · Computer Science 2025-09-18 Xingshuai Huang, Di Wu, Benoit Boulet

Recent research has highlighted the powerful capabilities of imitation learning in robotics. Leveraging generative models, particularly diffusion models, these approaches offer notable advantages such as strong multi-task generalization,…

Robotics · Computer Science 2025-09-15 Xinyao Qin, Xiaoteng Ma, Yang Qi, Qihan Liu, Chuanyi Xue, Ning Gui, Qinyu Dong, Jun Yang, Bin Liang

Learning visuomotor policy for multi-task robotic manipulation has been a long-standing challenge for the robotics community. The difficulty lies in the diversity of action space: typically, a goal can be accomplished in multiple ways,…

Robotics · Computer Science 2025-03-24 Kun Wu, Yichen Zhu, Jinming Li, Junjie Wen, Ning Liu, Zhiyuan Xu, Jian Tang

In autonomous driving tasks, trajectory prediction in complex traffic environments requires adherence to real-world context conditions and behavior multimodalities. Existing methods predominantly rely on prior assumptions or generative…

Computer Vision and Pattern Recognition · Computer Science 2024-02-07 Yiming Xu, Hao Cheng, Monika Sester

Pre-trained large language models demonstrate potential in extracting information from DNA sequences, yet adapting to a variety of tasks and data modalities remains a challenge. To address this, we propose DNAGPT, a generalized DNA…

Genomics · Quantitative Biology 2023-09-01 Daoan Zhang, Weitong Zhang, Yu Zhao, Jianguo Zhang, Bing He, Chenchen Qin, Jianhua Yao

Coherent X-ray scattering techniques are critical for investigating the fundamental structural properties of materials at the nanoscale. While advancements have made these experiments more accessible, real-time analysis remains a…

Machine Learning · Computer Science 2025-07-21 Aileen Luo, Tao Zhou, Ming Du, Martin V. Holt, Andrej Singer, Mathew J. Cherukara

Diffusion-based generative modeling has been achieving state-of-the-art results on various generation tasks. Most diffusion models, however, are limited to single-generation modeling. Can we generalize diffusion models with the ability of…

Computer Vision and Pattern Recognition · Computer Science 2024-09-26 Changyou Chen, Han Ding, Bunyamin Sisman, Yi Xu, Ouye Xie, Benjamin Z. Yao, Son Dinh Tran, Belinda Zeng

Diffusion policies trained via offline behavioral cloning have recently gained traction in robotic motion generation. While effective, these policies typically require a large number of trainable parameters. This model size affords powerful…

Robotics · Computer Science 2025-04-29 Xiatao Sun, Shuo Yang, Yinxing Chen, Francis Fan, Yiyan Liang, Daniel Rakita

Diffusion policies are conditional diffusion models that learn robot action distributions conditioned on the robot and environment state. They have recently been shown to outperform both deterministic and alternative action distribution learning…

Robotics · Computer Science 2024-07-26 Tsung-Wei Ke, Nikolaos Gkanatsios, Katerina Fragkiadaki