Related papers: DiffusionBrowser: Interactive Diffusion Previews v…

Diffusion Models in Vision: A Survey

Denoising diffusion models represent a recent emerging topic in computer vision, demonstrating remarkable results in the area of generative modeling. A diffusion model is a deep generative model that is based on two stages, a forward…

Computer Vision and Pattern Recognition · Computer Science 2025-01-17 Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, Mubarak Shah

Explaining generative diffusion models via visual analysis for interpretable decision-making process

Diffusion models have demonstrated remarkable performance in generation tasks. Nevertheless, explaining the diffusion process remains challenging due to it being a sequence of denoising noisy images that are difficult for experts to…

Computer Vision and Pattern Recognition · Computer Science 2024-02-19 Ji-Hoon Park, Yeong-Joon Ju, Seong-Whan Lee

Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference

One of the main drawbacks of diffusion models is the slow inference time for image generation. Among the most successful approaches to addressing this problem are distillation methods. However, these methods require considerable…

Computer Vision and Pattern Recognition · Computer Science 2024-10-16 Senmao Li, Taihang Hu, Joost van de Weijer, Fahad Shahbaz Khan, Tao Liu, Linxuan Li, Shiqi Yang, Yaxing Wang, Ming-Ming Cheng, Jian Yang

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data…

Computer Vision and Pattern Recognition · Computer Science 2023-10-16 Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan

Lazy Diffusion Transformer for Interactive Image Editing

We introduce a novel diffusion transformer, LazyDiffusion, that generates partial image updates efficiently. Our approach targets interactive image editing applications in which, starting from a blank canvas or an image, a user specifies a…

Computer Vision and Pattern Recognition · Computer Science 2024-04-19 Yotam Nitzan, Zongze Wu, Richard Zhang, Eli Shechtman, Daniel Cohen-Or, Taesung Park, Michaël Gharbi

DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models

Understanding and modeling lighting effects are fundamental tasks in computer vision and graphics. Classic physically-based rendering (PBR) accurately simulates the light transport, but relies on precise scene representations--explicit 3D…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Ruofan Liang, Zan Gojcic, Huan Ling, Jacob Munkberg, Jon Hasselgren, Zhi-Hao Lin, Jun Gao, Alexander Keller, Nandita Vijaykumar, Sanja Fidler, Zian Wang

RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation

Diffusion models currently achieve state-of-the-art performance for both conditional and unconditional image generation. However, so far, image diffusion models do not support tasks required for 3D understanding, such as view-consistent 3D…

Computer Vision and Pattern Recognition · Computer Science 2024-02-22 Titas Anciukevičius, Zexiang Xu, Matthew Fisher, Paul Henderson, Hakan Bilen, Niloy J. Mitra, Paul Guerrero

Spectral Progressive Diffusion for Efficient Image and Video Generation

Diffusion models have been shown to implicitly generate visual content autoregressively in the frequency domain, where low-frequency components are generated earlier in the denoising process while high-frequency details emerge only in later…

Computer Vision and Pattern Recognition · Computer Science 2026-05-21 Howard Xiao, Brian Chao, Lior Yariv, Gordon Wetzstein

Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation

Video generation using diffusion-based models is constrained by high computational costs due to the frame-wise iterative diffusion process. This work presents a Diffusion Reuse MOtion (Dr. Mo) network to accelerate latent video generation.…

Computer Vision and Pattern Recognition · Computer Science 2024-09-20 Chenyu Wang, Shuo Yan, Yixuan Chen, Yujiang Wang, Mingzhi Dong, Xiaochen Yang, Dongsheng Li, Robert P. Dick, Qin Lv, Fan Yang, Tun Lu, Ning Gu, Li Shang

Optical Diffusion Models for Image Generation

Diffusion models generate new samples by progressively decreasing the noise from the initially provided random distribution. This inference procedure generally utilizes a trained neural network numerous times to obtain the final output,…

Optics · Physics 2024-11-01 Ilker Oguz, Niyazi Ulas Dinc, Mustafa Yildirim, Junjie Ke, Innfarn Yoo, Qifei Wang, Feng Yang, Christophe Moser, Demetri Psaltis

Accelerating Diffusion Decoders via Multi-Scale Sampling and One-Step Distillation

Image tokenization plays a central role in modern generative modeling by mapping visual inputs into compact representations that serve as an intermediate signal between pixels and generative models. Diffusion-based decoders have recently…

Computer Vision and Pattern Recognition · Computer Science 2026-03-23 Chuhan Wang, Hao Chen

Diffusion Explorer: Interactive Exploration of Diffusion Models

Diffusion models have been central to the development of recent image, video, and even text generation systems. They possess striking geometric properties that can be faithfully portrayed in low-dimensional settings. However, existing…

Machine Learning · Computer Science 2025-07-08 Alec Helbling, Duen Horng Chau

Toward Lightweight and Fast Decoders for Diffusion Models in Image and Video Generation

We investigate methods to reduce inference time and memory footprint in stable diffusion models by introducing lightweight decoders for both image and video synthesis. Traditional latent diffusion pipelines rely on large Variational…

Computer Vision and Pattern Recognition · Computer Science 2025-03-10 Alexey Buzovkin, Evgeny Shilov

Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models

Beyond high-fidelity image synthesis, diffusion models have recently exhibited promising results in dense visual perception tasks. However, most existing work treats diffusion models as a standalone component for perception tasks, employing…

Computer Vision and Pattern Recognition · Computer Science 2025-12-18 Shuhong Zheng, Zhipeng Bao, Ruoyu Zhao, Martial Hebert, Yu-Xiong Wang

Denoising Diffusion via Image-Based Rendering

Generating 3D scenes is a challenging open problem, which requires synthesizing plausible content that is fully consistent in 3D space. While recent methods such as neural radiance fields excel at view synthesis and 3D reconstruction, they…

Computer Vision and Pattern Recognition · Computer Science 2024-02-22 Titas Anciukevičius, Fabian Manhardt, Federico Tombari, Paul Henderson

Collaborative Diffusion for Multi-Modal Face Generation and Editing

Diffusion models have recently emerged as a powerful generative tool. Despite the great progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further…

Computer Vision and Pattern Recognition · Computer Science 2023-04-21 Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu

DiffusER: Discrete Diffusion via Edit-based Reconstruction

In text generation, models that generate text from scratch one token at a time are currently the dominant paradigm. Despite being performant, these models lack the ability to revise existing text, which limits their usability in many…

Computation and Language · Computer Science 2022-11-01 Machel Reid, Vincent J. Hellendoorn, Graham Neubig

StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation

We introduce StreamDiffusion, a real-time diffusion pipeline designed for interactive image generation. Existing diffusion models are adept at creating images from text or image prompts, yet they often fall short in real-time interaction.…

Computer Vision and Pattern Recognition · Computer Science 2025-07-09 Akio Kodaira, Chenfeng Xu, Toshiki Hazama, Takanori Yoshimoto, Kohei Ohno, Shogo Mitsuhori, Soichi Sugano, Hanying Cho, Zhijian Liu, Masayoshi Tomizuka, Kurt Keutzer

TransDiffuser: Diverse Trajectory Generation with Decorrelated Multi-modal Representation for End-to-end Autonomous Driving

In recent years, diffusion models have demonstrated remarkable potential across diverse domains, from vision generation to language modeling. Transferring their generative capabilities to modern end-to-end autonomous driving systems has also…

Robotics · Computer Science 2025-09-17 Xuefeng Jiang, Yuan Ma, Pengxiang Li, Leimeng Xu, Xin Wen, Kun Zhan, Zhongpu Xia, Peng Jia, Xianpeng Lang, Sheng Sun

Degradation-Robust Fusion: An Efficient Degradation-Aware Diffusion Framework for Multimodal Image Fusion in Arbitrary Degradation Scenarios

Complex degradations like noise, blur, and low resolution are typical challenges in real-world image fusion tasks, limiting the performance and practicality of existing methods. End-to-end neural-network-based approaches are generally…

Computer Vision and Pattern Recognition · Computer Science 2026-04-13 Yu Shi, Yu Liu, Zhong-Cheng Wu, Juan Cheng, Huafeng Li, Xun Chen