
Related papers: Implicit Contact Diffuser: Sequential Contact Reas…

200 papers

Diffusion models have shown excellent performance in text-to-image generation. Nevertheless, existing methods often suffer from performance bottlenecks when handling complex prompts that involve multiple objects, characteristics, and…

Computer Vision and Pattern Recognition · Computer Science 2025-05-07 Mingcheng Li, Xiaolu Hou, Ziyang Liu, Dingkang Yang, Ziyun Qian, Jiawei Chen, Jinjie Wei, Yue Jiang, Qingyao Xu, Lihua Zhang

Understanding how humans would behave during hand-object interaction is vital for applications in service robot manipulation and extended reality. To achieve this, some recent works have been proposed to simultaneously forecast hand…

Computer Vision and Pattern Recognition · Computer Science 2025-11-17 Junyi Ma, Jingyi Xu, Xieyuanli Chen, Hesheng Wang

Decision-making in robotics using denoising diffusion processes has increasingly become a hot research topic, but end-to-end policies perform poorly in tasks with rich contact and have limited controllability. This paper proposes…

Robotics · Computer Science 2024-11-21 Dexin Wang, Chunsheng Liu, Faliang Chang, Yichen Xu

As robots become more integrated in society, their ability to coordinate with other robots and humans on multi-modal tasks (those with multiple valid solutions) is crucial. Such behaviors can be learned from expert demonstrations via…

Robotics · Computer Science 2026-05-15 Dayi Dong, Maulik Bhatt, Seoyeon Choi, Negar Mehr

Text-to-image diffusion models exhibit remarkable generative capabilities, yet their internal operations remain opaque, particularly when handling prompts that are not fully descriptive. In such scenarios, models must make implicit…

Computer Vision and Pattern Recognition · Computer Science 2026-04-08 Katarzyna Zaleska, Łukasz Popek, Monika Wysoczańska, Kamil Deja

Image super-resolution (SR) has attracted increasing attention due to its wide applications. However, current SR methods generally suffer from over-smoothing and artifacts, and most work only with fixed magnifications. This paper introduces…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Sicheng Gao, Xuhui Liu, Bohan Zeng, Sheng Xu, Yanjing Li, Xiaoyan Luo, Jianzhuang Liu, Xiantong Zhen, Baochang Zhang

In this paper, a novel approach is proposed for learning robot control in contact-rich tasks such as wiping, by developing Diffusion Contact Model (DCM). Previous methods of learning such tasks relied on impedance control with time-varying…

Robotics · Computer Science 2024-03-21 Masashi Okada, Mayumi Komatsu, Tadahiro Taniguchi

Discrete diffusion models are a class of generative models that construct sequences by progressively denoising samples from a categorical noise distribution. Beyond their rapidly growing ability to generate coherent natural language, these…

Computation and Language · Computer Science 2025-12-11 Michael Cardei, Jacob K Christopher, Thomas Hartvigsen, Bhavya Kailkhura, Ferdinando Fioretto

We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models. Given a pair of task-specific example images, such as depth from/to image and scribble from/to image, and a text guidance, our…

Computer Vision and Pattern Recognition · Computer Science 2023-10-20 Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou

Dexterous in-hand manipulation is an essential skill of production and life. However, the highly stiff and mutable nature of contacts limits real-time contact detection and inference, degrading the performance of model-based methods.…

Robotics · Computer Science 2024-11-08 Yongpeng Jiang, Mingrui Yu, Xinghao Zhu, Masayoshi Tomizuka, Xiang Li

Discrete diffusion models have emerged as a powerful class of models and a promising route to fast language generation, but practical implementations typically rely on factored reverse transitions ignoring cross-token dependencies and…

Machine Learning · Computer Science 2026-05-14 Dario Shariatian, Alain Durmus, Umut Simsekli, Stefano Peluchetti

Standard discrete diffusion models treat all unobserved states identically by mapping them to an absorbing [MASK] token. This creates an 'information void' where semantic information that could be inferred from unmasked tokens is lost…

Discrete diffusion models form a powerful class of generative models across diverse domains, including text and graphs. However, existing approaches face fundamental limitations. Masked diffusion models suffer from irreversible errors due…

Machine Learning · Computer Science 2026-04-21 Marcel Kollovieh, Sirine Ayadi, Stephan Günnemann

Humans can accomplish complex contact-rich tasks using vision and touch, with highly reactive capabilities such as fast response to external changes and adaptive control of contact forces; however, this remains challenging for robots.…

Robotics · Computer Science 2025-04-24 Han Xue, Jieji Ren, Wendi Chen, Gu Zhang, Yuan Fang, Guoying Gu, Huazhe Xu, Cewu Lu

We present a general approach for controlling robotic systems that make and break contact with their environments. Contact-implicit model predictive control (CI-MPC) generalizes linear MPC to contact-rich settings by utilizing a bi-level…

Diffusion language models, especially masked discrete diffusion models, have achieved great success recently. While there are some theoretical and primary empirical results showing the advantages of latent reasoning with looped transformers…

Artificial Intelligence · Computer Science 2026-05-13 Cai Zhou, Chenxiao Yang, Yi Hu, Chenyu Wang, Chubin Zhang, Muhan Zhang, Lester Mackey, Tommi Jaakkola, Stephen Bates, Dinghuai Zhang

Manipulation of articulated and deformable objects can be difficult due to their compliant and under-actuated nature. Unexpected disturbances can cause the object to deviate from a predicted state, making it necessary to use…

Robotics · Computer Science 2024-03-21 Zixuan Huang, Yating Lin, Fan Yang, Dmitry Berenson

Graphic design visually conveys information and data by creating and combining text, images and graphics. Two-stage methods that rely primarily on layout generation lack creativity and intelligence, making graphic design still…

Computer Vision and Pattern Recognition · Computer Science 2025-07-15 Yadong Qu, Shancheng Fang, Yuxin Wang, Xiaorui Wang, Zhineng Chen, Hongtao Xie, Yongdong Zhang

Acting in cluttered environments requires predicting and avoiding collisions while still achieving precise control. Conventional optimization-based controllers can enforce physical constraints, but they struggle to produce feasible…

Human-level contact-rich manipulation relies on the distinct roles of two key modalities: vision provides spatially rich but temporally slow global context, while force sensing captures rapid, high-frequency local contact dynamics.…

Robotics · Computer Science 2025-12-12 Wendi Chen, Han Xue, Yi Wang, Fangyuan Zhou, Jun Lv, Yang Jin, Shirun Tang, Chuan Wen, Cewu Lu