Related papers: Implicit Contact Diffuser: Sequential Contact Reas…

MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation

Diffusion models have shown excellent performance in text-to-image generation. Nevertheless, existing methods often suffer from performance bottlenecks when handling complex prompts that involve multiple objects, characteristics, and…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-07 · Mingcheng Li, Xiaolu Hou, Ziyang Liu, Dingkang Yang, Ziyun Qian, Jiawei Chen, Jinjie Wei, Yue Jiang, Qingyao Xu, Lihua Zhang

Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos

Understanding how humans would behave during hand-object interaction is vital for applications in service robot manipulation and extended reality. To achieve this, some recent works have been proposed to simultaneously forecast hand…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-17 · Junyi Ma, Jingyi Xu, Xieyuanli Chen, Hesheng Wang

Hierarchical Diffusion Policy: manipulation trajectory generation via contact guidance

Decision-making in robotics using denoising diffusion processes has increasingly become a hot research topic, but end-to-end policies perform poorly in tasks with rich contact and have limited controllability. This paper proposes…

Robotics · Computer Science · 2024-11-21 · Dexin Wang, Chunsheng Liu, Faliang Chang, Yichen Xu

MIMIC-D: Multi-modal Imitation for MultI-agent Coordination with Decentralized Diffusion Policies

As robots become more integrated in society, their ability to coordinate with other robots and humans on multi-modal tasks (those with multiple valid solutions) is crucial. Such behaviors can be learned from expert demonstrations via…

Robotics · Computer Science · 2026-05-15 · Dayi Dong, Maulik Bhatt, Seoyeon Choi, Negar Mehr

Attention, May I Have Your Decision? Localizing Generative Choices in Diffusion Models

Text-to-image diffusion models exhibit remarkable generative capabilities, yet their internal operations remain opaque, particularly when handling prompts that are not fully descriptive. In such scenarios, models must make implicit…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-08 · Katarzyna Zaleska, Łukasz Popek, Monika Wysoczańska, Kamil Deja

Implicit Diffusion Models for Continuous Super-Resolution

Image super-resolution (SR) has attracted increasing attention due to its wide applications. However, current SR methods generally suffer from over-smoothing and artifacts, and most work only with fixed magnifications. This paper introduces…

Computer Vision and Pattern Recognition · Computer Science · 2023-09-06 · Sicheng Gao, Xuhui Liu, Bohan Zeng, Sheng Xu, Yanjing Li, Xiaoyan Luo, Jianzhuang Liu, Xiantong Zhen, Baochang Zhang

A Contact Model based on Denoising Diffusion to Learn Variable Impedance Control for Contact-rich Manipulation

In this paper, a novel approach is proposed for learning robot control in contact-rich tasks such as wiping, by developing Diffusion Contact Model (DCM). Previous methods of learning such tasks relied on impedance control with time-varying…

Robotics · Computer Science · 2024-03-21 · Masashi Okada, Mayumi Komatsu, Tadahiro Taniguchi

Constrained Discrete Diffusion

Discrete diffusion models are a class of generative models that construct sequences by progressively denoising samples from a categorical noise distribution. Beyond their rapidly growing ability to generate coherent natural language, these…

Computation and Language · Computer Science · 2025-12-11 · Michael Cardei, Jacob K Christopher, Thomas Hartvigsen, Bhavya Kailkhura, Ferdinando Fioretto

In-Context Learning Unlocked for Diffusion Models

We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models. Given a pair of task-specific example images, such as depth from/to image and scribble from/to image, and a text guidance, our…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-20 · Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou

Contact-Implicit Model Predictive Control for Dexterous In-hand Manipulation: A Long-Horizon and Robust Approach

Dexterous in-hand manipulation is an essential skill of production and life. However, the highly stiff and mutable nature of contacts limits real-time contact detection and inference, degrading the performance of model-based methods.…

Robotics · Computer Science · 2024-11-08 · Yongpeng Jiang, Mingrui Yu, Xinghao Zhu, Masayoshi Tomizuka, Xiang Li

Latent-Augmented Discrete Diffusion Models

Discrete diffusion models have emerged as a powerful class of models and a promising route to fast language generation, but practical implementations typically rely on factored reverse transitions ignoring cross-token dependencies and…

Machine Learning · Computer Science · 2026-05-14 · Dario Shariatian, Alain Durmus, Umut Simsekli, Stefano Peluchetti

Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling

Standard discrete diffusion models treat all unobserved states identically by mapping them to an absorbing [MASK] token. This creates an 'information void' where semantic information that could be inferred from unmasked tokens is lost…

Machine Learning · Statistics · 2025-10-03 · Huangjie Zheng, Shansan Gong, Ruixiang Zhang, Tianrong Chen, Jiatao Gu, Mingyuan Zhou, Navdeep Jaitly, Yizhe Zhang

Interpolating Discrete Diffusion Models with Controllable Resampling

Discrete diffusion models form a powerful class of generative models across diverse domains, including text and graphs. However, existing approaches face fundamental limitations. Masked diffusion models suffer from irreversible errors due…

Machine Learning · Computer Science · 2026-04-21 · Marcel Kollovieh, Sirine Ayadi, Stephan Günnemann

Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation

Humans can accomplish complex contact-rich tasks using vision and touch, with highly reactive capabilities such as fast response to external changes and adaptive control of contact forces; however, this remains challenging for robots.…

Robotics · Computer Science · 2025-04-24 · Han Xue, Jieji Ren, Wendi Chen, Gu Zhang, Yuan Fang, Guoying Gu, Huazhe Xu, Cewu Lu

Fast Contact-Implicit Model-Predictive Control

We present a general approach for controlling robotic systems that make and break contact with their environments. Contact-implicit model predictive control (CI-MPC) generalizes linear MPC to contact-rich settings by utilizing a bi-level…

Robotics · Computer Science · 2023-01-09 · Simon Le Cleac'h, Taylor Howell, Shuo Yang, Chi-Yen Lee, John Zhang, Arun Bishop, Mac Schwager, Zachary Manchester

Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner

Diffusion language models, especially masked discrete diffusion models, have achieved great success recently. While there are some theoretical and primary empirical results showing the advantages of latent reasoning with looped transformers…

Artificial Intelligence · Computer Science · 2026-05-13 · Cai Zhou, Chenxiao Yang, Yi Hu, Chenyu Wang, Chubin Zhang, Muhan Zhang, Lester Mackey, Tommi Jaakkola, Stephen Bates, Dinghuai Zhang

Subgoal Diffuser: Coarse-to-fine Subgoal Generation to Guide Model Predictive Control for Robot Manipulation

Manipulation of articulated and deformable objects can be difficult due to their compliant and under-actuated nature. Unexpected disturbances can cause the object to deviate from a predicted state, making it necessary to use…

Robotics · Computer Science · 2024-03-21 · Zixuan Huang, Yating Lin, Fan Yang, Dmitry Berenson

IGD: Instructional Graphic Design with Multimodal Layer Generation

Graphic design visually conveys information and data by creating and combining text, images and graphics. Two-stage methods that rely primarily on layout generation lack creativity and intelligence, making graphic design still…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-15 · Yadong Qu, Shancheng Fang, Yuxin Wang, Xiaorui Wang, Zhineng Chen, Hongtao Xie, Yongdong Zhang

Warm-Starting Collision-Free Model Predictive Control With Object-Centric Diffusion

Acting in cluttered environments requires predicting and avoiding collisions while still achieving precise control. Conventional optimization-based controllers can enforce physical constraints, but they struggle to produce feasible…

Robotics · Computer Science · 2026-01-22 · Arthur Haffemayer, Alexandre Chapin, Armand Jordana, Krzysztof Wojciechowski, Florent Lamiraux, Nicolas Mansard, Vladimir Petrik

ImplicitRDP: An End-to-End Visual-Force Diffusion Policy with Structural Slow-Fast Learning

Human-level contact-rich manipulation relies on the distinct roles of two key modalities: vision provides spatially rich but temporally slow global context, while force sensing captures rapid, high-frequency local contact dynamics.…

Robotics · Computer Science · 2025-12-12 · Wendi Chen, Han Xue, Yi Wang, Fangyuan Zhou, Jun Lv, Yang Jin, Shirun Tang, Chuan Wen, Cewu Lu